Ion deployment failure

I’m deploying an ion and it is failing. Where is the first place to start looking for the smoking gun? CloudWatch, Step Functions, CodeDeploy, or somewhere else?

Digging further into the failed script just shows:

[stdout]Received 503
[stdout]Received 503
[stdout]Received 503
[stdout]Received 503
[stdout]Received 503
[stdout]Received 503
[stdout]Received 503
[stdout]Received 503
[stdout]Received 503
[stdout]Received 503
[stdout]Received 503
[stdout]Received 503

Hi @jarrod, what is the status of your deploy when you’re seeing this? Are you running solo or production, and what CFT version? Generally, I recommend that you start by reviewing CloudWatch (for alerts and messages), but this looks symptomatic of a timeout in loading your deps or your app at the “ValidateService” step. I’d be curious what changed between deployments that worked and this one.

FYI

Ion monitoring docs:
https://docs.datomic.com/cloud/ions/ions-monitoring.html#local-workflow

Ion troubleshooting docs with known errors:
https://docs.datomic.com/cloud/troubleshooting.html#troubleshooting-ions

Thanks for the references @jaret, I have looked over those and found them helpful, though not for my particular issue. I did find that it was a dependency that was causing the issue - I was hoping for a more explicit log or error that would indicate which dependency.

I am not quite sure where to find the CFT version. I am running a solo topology.

Jarrod,

The CFT Version can be found in the “Outputs” tab of the CloudFormation stack console.

As far as identifying dependency issues - when you run the Ion push operation you should see an output like:

{:rev "8baf1c47e0bb62faf68c76cf7fefa05635f2ed01",
 :uname "mt-ion-test",
 :deploy-groups (mt-test-solo),
 :dependency-conflicts
 {:deps
  {commons-codec/commons-codec #:mvn{:version "1.10"},
   com.cognitect/http-client #:mvn{:version "0.1.80"},
   org.slf4j/slf4j-api #:mvn{:version "1.7.14"},
   org.clojure/core.async #:mvn{:version "0.3.442"}},
  :doc
  "The :push operation overrode these dependencies to match versions already running in Datomic Cloud. To test locally, add these explicit deps to your deps.edn."},

The first step I would take would be to test your Ion locally with any reported deps from that response explicitly included in your local deps.edn file.
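As a rough sketch of what that could look like for the example output above: the `#:mvn{:version "1.10"}` form in the push output is just the namespaced-map spelling of `{:mvn/version "1.10"}`, so the reported versions can be copied into deps.edn as ordinary coordinates. The placeholder comment for the rest of the project's deps is an assumption; only the four coordinates come from the output shown.

```clojure
;; deps.edn (sketch): pin the versions reported under :dependency-conflicts
;; so the local classpath matches what is already running in Datomic Cloud.
{:deps
 {commons-codec/commons-codec {:mvn/version "1.10"}
  com.cognitect/http-client   {:mvn/version "0.1.80"}
  org.slf4j/slf4j-api         {:mvn/version "1.7.14"}
  org.clojure/core.async      {:mvn/version "0.3.442"}
  ;; ... the project's existing deps stay here, unchanged
  }}
```

With these pinned, anything that breaks locally when the ion namespaces are loaded is a good candidate for what is failing during the deploy.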

-Marshall

I have a similar problem, except the error code is ScriptTimedOut. The problem began when I changed my compute stack from solo to production. I’ve tried all day long and it keeps timing out at the ValidateService step. The logs just show “[stdout]Received 000” which doesn’t mean much to me.

Update: it now also happens when I use the solo compute stack. So this may be related to the updated templates and/or updated ion dependencies, because other than that no code has been changed.

I narrowed it down to one problematic dependency: leiningen. I was using leiningen as a library, and for some reason the deploy didn’t like that. Luckily I was using it for a pretty narrow purpose, so I was able to remove it entirely, and now the deploys work for me. I basically had to use the process of elimination to figure it out. It definitely would have been nice to see the cause of the timeout, but I guess AWS doesn’t provide that.

I ran into the same problem.

For me, it was a missing lib (Apache Commons IO).

I looked in the CloudWatch logs for datomic-<system>, based on the post “Where to look in AWS when deploy (not push) fails?”, and found a compiler exception caused by a missing library:

{
    "Msg": "LoadIonsFailed",
    "Ex": {
        "Via": [
            {
                "Type": "clojure.lang.Compiler$CompilerException",
                "Message": "Syntax error compiling at (ring/middleware/multipart_params.clj:1:1).",
                "Data": {
                    "ClojureErrorPhase": "CompileSyntaxCheck",
                    "ClojureErrorLine": 1,
                    "ClojureErrorColumn": 1,
                    "ClojureErrorSource": "ring/middleware/multipart_params.clj"
                },
                "At": [
                    "clojure.lang.Compiler",
                    "load",
                    "Compiler.java",
                    7647
                ]
            },
            {
                "Type": "java.lang.ClassNotFoundException",
                "Message": "org.apache.commons.io.IOUtils",
                "At": [
                    "java.net.URLClassLoader",
                    "findClass",
                    "URLClassLoader.java",
                    382
                ]
            }
        ],

...
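A similar exception chain can often be reproduced at a local REPL before deploying, which may save a trip through CloudWatch. The following is only a minimal sketch: `load-report` is a hypothetical helper name and `my.app.ions` is a placeholder for whichever namespace holds your ion entry points. It relies only on `require` and `Throwable->map` from clojure.core. Note that the local classpath can differ from the one in Datomic Cloud, so a clean local load does not guarantee a clean deploy.

```clojure
;; Hypothetical helper: try to load a namespace and, on failure, return the
;; exception chain (outermost first), much like the "Via" list in the log.
(defn load-report
  [ns-sym]
  (try
    (require ns-sym)
    {:ns ns-sym :loaded? true}
    (catch Throwable t
      {:ns ns-sym
       :loaded? false
       ;; Throwable->map exposes the cause chain under :via
       :via (mapv #(select-keys % [:type :message])
                  (:via (Throwable->map t)))})))

;; Placeholder namespace; substitute the namespace your ions live in.
(load-report 'my.app.ions)
```

The last entry in `:via` is usually the interesting one; in the log above that would be the `ClassNotFoundException` naming the missing class.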