Ion deployment failure

jarrod · October 30, 2018, 9:25pm

I’m deploying an ion and it is failing. Where is the first place to start looking for the smoking gun? CloudWatch, Step Functions, CodeDeploy, or somewhere else?

Digging further into the failed script just shows:

[stdout]Received 503
[stdout]Received 503
[stdout]Received 503
[stdout]Received 503
[stdout]Received 503
[stdout]Received 503
[stdout]Received 503
[stdout]Received 503
[stdout]Received 503
[stdout]Received 503
[stdout]Received 503
[stdout]Received 503

jaret · November 1, 2018, 6:43pm

Hi @jarrod, what is the status of your deploy when you’re seeing this? Are you running solo or production, and which CFT version? Generally, I recommend that you start by reviewing CloudWatch (for alerts and messages), but this looks symptomatic of a timeout while loading your deps or your app at the “ValidateService” step. I’d be curious what changed between the deployments that worked and this one.

FYI

Ion monitoring docs:
https://docs.datomic.com/cloud/ions/ions-monitoring.html#local-workflow

Ion troubleshooting docs with known errors:
https://docs.datomic.com/cloud/troubleshooting.html#troubleshooting-ions

jarrod · November 5, 2018, 7:27pm

Thanks for the references @jaret. I have looked them over and found them helpful, though not for my particular issue. I did find that a dependency was causing the failure; I was hoping for a more explicit log or error that would indicate which dependency.

I am not quite sure where to find the CFT version. I am running a solo topology.

marshall · November 8, 2018, 3:45pm

Jarrod,

The CFT Version can be found in the “Outputs” tab of the CloudFormation stack console.

As for identifying dependency issues: when you run the Ion push operation, you should see output like:

{:rev "8baf1c47e0bb62faf68c76cf7fefa05635f2ed01",
 :uname "mt-ion-test",
 :deploy-groups (mt-test-solo),
 :dependency-conflicts
 {:deps
  {commons-codec/commons-codec #:mvn{:version "1.10"},
   com.cognitect/http-client #:mvn{:version "0.1.80"},
   org.slf4j/slf4j-api #:mvn{:version "1.7.14"},
   org.clojure/core.async #:mvn{:version "0.3.442"}},
  :doc
  "The :push operation overrode these dependencies to match versions already running in Datomic Cloud. To test locally, add these explicit deps to your deps.edn."},

The first step I would take would be to test your Ion locally with any reported deps from that response explicitly included in your local deps.edn file.
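
Editor's sketch, not part of the original reply: the push output prints each override in namespace-map form (`#:mvn{:version "1.10"}`), which is the same data as the `{:mvn/version "1.10"}` form usually written in deps.edn. The helper below is hypothetical (`render_deps` is an illustrative name, not a Datomic tool); it only formats the versions reported above so they can be merged into the existing :deps map of a local deps.edn.

```python
# Versions copied from the :dependency-conflicts output shown above.
conflicts = {
    "commons-codec/commons-codec": "1.10",
    "com.cognitect/http-client": "0.1.80",
    "org.slf4j/slf4j-api": "1.7.14",
    "org.clojure/core.async": "0.3.442",
}


def render_deps(overrides):
    """Render {lib: version} as a deps.edn-style :deps map (hypothetical helper)."""
    entries = [
        f'{lib} {{:mvn/version "{version}"}}'
        for lib, version in overrides.items()
    ]
    return "{:deps\n {" + "\n  ".join(entries) + "}}"


if __name__ == "__main__":
    # Prints entries to merge by hand into your own deps.edn.
    print(render_deps(conflicts))
```

The printed entries are meant to be merged into your existing :deps map rather than to replace the file, since your ion's own dependencies still need to be listed.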

-Marshall

sekao · May 26, 2019, 5:18pm

I have a similar problem, except the error code is ScriptTimedOut. The problem began when I changed my compute stack from solo to production. I’ve tried all day long and it keeps timing out at the ValidateService step. The logs just show “[stdout]Received 000” which doesn’t mean much to me.

Update: it actually happens now when I use the solo compute stack. So this may be related to the updated templates and/or updated ion dependencies, because other than that no code has been changed.

sekao · May 27, 2019, 1:53am

I narrowed it down to one problematic dependency: leiningen. I was using leiningen as a library, and for some reason it didn’t like that. Luckily I was using it for a pretty narrow purpose, so I was able to remove it entirely, and now the deploys work for me. I basically had to use the process of elimination to figure it out. It definitely would have been nice to see the cause of the timeout, but I guess AWS doesn’t provide that.

ckws · May 28, 2019, 1:13pm

I ran into the same problem.

For me, it was a missing lib (Apache Commons IO).

I looked in the CloudWatch logs for datomic-<system>, based on the post “Where to look in AWS when deploy (not push) fails?”, and found a compiler exception caused by a missing library:

{
    "Msg": "LoadIonsFailed",
    "Ex": {
        "Via": [
            {
                "Type": "clojure.lang.Compiler$CompilerException",
                "Message": "Syntax error compiling at (ring/middleware/multipart_params.clj:1:1).",
                "Data": {
                    "ClojureErrorPhase": "CompileSyntaxCheck",
                    "ClojureErrorLine": 1,
                    "ClojureErrorColumn": 1,
                    "ClojureErrorSource": "ring/middleware/multipart_params.clj"
                },
                "At": [
                    "clojure.lang.Compiler",
                    "load",
                    "Compiler.java",
                    7647
                ]
            },
            {
                "Type": "java.lang.ClassNotFoundException",
                "Message": "org.apache.commons.io.IOUtils",
                "At": [
                    "java.net.URLClassLoader",
                    "findClass",
                    "URLClassLoader.java",
                    382
                ]
            }
        ],

...
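
Editor's sketch, not part of the original post: in an event like the one above, the outer CompilerException only names the file that failed to load, while the last entry in "Via" names the class that was actually missing. Assuming the "Via" list runs from outermost exception to root cause (as it does in this event), a small script can pull out that last entry. The function name `root_cause` and the trimmed sample event are illustrative; the sample keeps only fields shown above, with closing braces added so that it parses.

```python
import json


def root_cause(event_json):
    """Return 'Type: Message' for the last "Via" entry of a LoadIonsFailed event.

    Returns None if the event is not a LoadIonsFailed message or carries no
    exception chain. Assumes "Via" is ordered outermost-first.
    """
    event = json.loads(event_json)
    if event.get("Msg") != "LoadIonsFailed":
        return None
    via = event.get("Ex", {}).get("Via", [])
    if not via:
        return None
    last = via[-1]
    return f'{last.get("Type")}: {last.get("Message")}'


# Trimmed copy of the event above (illustrative, not the full log entry).
sample = """
{"Msg": "LoadIonsFailed",
 "Ex": {"Via": [
   {"Type": "clojure.lang.Compiler$CompilerException",
    "Message": "Syntax error compiling at (ring/middleware/multipart_params.clj:1:1)."},
   {"Type": "java.lang.ClassNotFoundException",
    "Message": "org.apache.commons.io.IOUtils"}]}}
"""

if __name__ == "__main__":
    print(root_cause(sample))
    # prints: java.lang.ClassNotFoundException: org.apache.commons.io.IOUtils
```

Here the result points at org.apache.commons.io.IOUtils, which matches the missing Apache Commons IO library described in the post.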
