Where to look in AWS when deploy (not push) fails?

So, after pushing and then polling the deploy-status command for about 2 minutes, I got the dreaded {:deploy-status "FAILED", :code-deploy-status "FAILED"}.

Where can I go from there? I’d look in CloudWatch logs, but there are more than 30 log groups created by the stack, so it’s a bit of a needle in a haystack.

I looked in CodeDeploy, and all I found there was:

The overall deployment failed because too many individual instances failed deployment, too few healthy instances are available for deployment, or some instances in your deployment group are experiencing problems

as well as the event:

Error code: ScriptFailed

Script name: scripts/deploy-validate

Message: Script at specified location: scripts/deploy-validate run as user datomic failed with exit code 1 

Logs:

[stdout]Received 503
[stdout]Received 503
// [... a few dozens of the same elided]
[stdout]Received 503
[stdout]WARN: validation did not succeed after two minutes

This does not seem to be related to the application code, as deploying a revision that previously worked fails similarly.

Generally speaking, a checklist for troubleshooting Ions deployments (or more detailed error messages) would be appreciated.

Cheers,

Hi Val,

You should look at the specific log group for your Datomic system. It is named datomic-<your-system-name>.

The details of the code failure in your ion deployment should appear in that group.
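A minimal sketch of pulling recent messages from that log group, assuming Python with boto3 and a CloudWatch Logs client created via boto3.client("logs"). The helper names (system_log_group, recent_events) are hypothetical, and the group name follows the datomic-<your-system-name> convention stated in this reply, which may differ on newer stacks.

```python
def system_log_group(system_name):
    """Build the log group name described above: datomic-<your-system-name>."""
    return f"datomic-{system_name}"


def recent_events(logs_client, system_name, start_ms, pattern=""):
    """Collect log messages from the system log group since start_ms.

    logs_client is assumed to be boto3.client("logs").
    start_ms is a timestamp in milliseconds since the Unix epoch.
    pattern is an optional CloudWatch Logs filter pattern.
    """
    kwargs = {
        "logGroupName": system_log_group(system_name),
        "startTime": start_ms,
    }
    if pattern:
        kwargs["filterPattern"] = pattern

    messages = []
    # filter_log_events is paginated, so walk every page of results.
    paginator = logs_client.get_paginator("filter_log_events")
    for page in paginator.paginate(**kwargs):
        messages.extend(event["message"] for event in page.get("events", []))
    return messages
```

With real credentials this would be called as recent_events(boto3.client("logs"), "<your-system-name>", start_ms), where start_ms is a little before the time of the failed deploy.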

We intend to improve the clarity of naming of the other (internal) log groups going forward.

I believe this answer is out of date. My guess is that the naming of internal groups has been changed and that the deployment logs no longer go in datomic-<your-system-name>. I see groups with CreateCodeDeploy, CreateLambdaFromArray, GetDeploymentStatus, and others in the name. I imagine one of those will have it.

I see these in “CloudWatch -> Logs -> Log groups”.
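To narrow down the list without clicking through the console, something like the following could filter log group names by the fragments mentioned above. This is only a sketch, assuming Python with boto3 and a client from boto3.client("logs"); matching_log_groups is a hypothetical helper, and whether any of these groups actually holds the deployment error is not confirmed here.

```python
def matching_log_groups(logs_client, keywords):
    """Return sorted log group names containing any keyword (case-insensitive).

    logs_client is assumed to be boto3.client("logs").
    """
    wanted = [keyword.lower() for keyword in keywords]
    names = []
    # describe_log_groups is paginated, so walk every page.
    paginator = logs_client.get_paginator("describe_log_groups")
    for page in paginator.paginate():
        for group in page.get("logGroups", []):
            name = group["logGroupName"]
            if any(keyword in name.lower() for keyword in wanted):
                names.append(name)
    return sorted(names)
```

For example, matching_log_groups(client, ["CreateCodeDeploy", "GetDeploymentStatus"]) would return only the groups whose names contain one of those fragments.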

If anyone knows more precisely where this information is, I’d appreciate the help because I cannot find it either.

I actually found an error message in “CodeDeploy -> Deployments”, on the deployment that was in progress. And it was a strange one, probably good for starting another topic. AWS displayed the error right at the top of the deployment information page: “memory allocation error” for the rm command! I reset the instances in the deployment autoscaling group and everything works fine.
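The same top-level error can be fetched programmatically. Here is a minimal sketch, assuming Python with boto3 and a client from boto3.client("codedeploy"); deployment_error is a hypothetical helper, and the deployment ID has to be taken from the CodeDeploy console or from your own deploy output.

```python
def deployment_error(codedeploy_client, deployment_id):
    """Summarize the status and error information of one CodeDeploy deployment.

    codedeploy_client is assumed to be boto3.client("codedeploy").
    """
    response = codedeploy_client.get_deployment(deploymentId=deployment_id)
    info = response["deploymentInfo"]
    # errorInformation is only present when the deployment reported an error.
    error = info.get("errorInformation", {})
    return {
        "status": info.get("status"),
        "code": error.get("code"),
        "message": error.get("message"),
    }
```

A deployment with no reported error simply yields None for code and message.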

@jzwolak, we have seen this same problem twice in the past week. Did you ever figure out the root cause and a resolution?

Forgot to mention that this “memory allocation error” only started happening after we updated our Datomic Cloud stacks to the latest (free) version. ← @jaret

@cch1, I’m not sure what you mean by “free”. Datomic changed its license in April 2023 such that all Datomic versions are free. The AWS resources, however, have never been free for any version of Datomic.

I never figured out what caused the memory allocation problem. I have not seen it since and have deployed many times. Perhaps something was fixed or I upgraded to a more recent version. I’m confident I am not running the latest version at the time of this writing.

Things have been working for me for some time now (including new deployments) without a problem.

I am referring to the Datomic Cloud CF stacks that implement the “free” (of Cognitect charges) version.

Are you still seeing this? Could I review your logs? What size instance are you running?

I’d love to open a case to track this down:

https://support.cognitect.com/hc/en-us/requests/new

Hi Jaret,
We upgraded our instance size from the smallest to the next-to-smallest (t3.medium) and that seems to have resolved the issue. Prior to the upgrade we were running our development environment and our production environment on same-sized instances and only having the problem in production. Worth noting that we have no users live yet, so the load was minimal in both cases. The one notable difference: I had detailed metrics enabled in prod but not in dev.