Datomic Cloud Solo subscription/install failing


#1

I have been trying to create a Solo stack, but I get the following error while it is creating the Compute stack, and then the installation rolls back.

I successfully created one a while back and deleted the instance at that time.

|Time|Status|Type|Logical ID|Status reason|
|---|---|---|---|---|
|16:59:41 UTC-0700|CREATE_FAILED|AWS::CloudFormation::Stack|Compute|Embedded stack arn:aws:cloudformation:us-east-1:554409359773:stack/iw-dc-stack-dev-Compute-2Z4IDT94JOPZ/fa11ab20-d19e-11e8-ae45-50fae98a24fd was not successfully created: The following resource(s) failed to create: [TxLaunchConfig].|

Physical ID: arn:aws:cloudformation:us-east-1:554409359773:stack/iw-dc-stack-dev-Compute-2Z4IDT94JOPZ/fa11ab20-d19e-11e8-ae45-50fae98a24fd

Client Request Token: Console-CreateStack-6562ff93-ba1a-49bf-9aa5-edf9dc9d5c2a


#2

I tried again after 3 hours, and this time it just worked! I didn’t do anything different from the last few tries, which failed with the above message.

Thanks.


#3

If I do not have 3 hours, how should I proceed?


#4

I think it takes AWS some time to finish deleting all the instances and storage. I don’t know exactly how long it took. The only thing I know is that I retried only after 3 hours!

Or create instances with different names…


#5

If you are launching a system using the “master” template hosted by AWS Marketplace (instead of Storage and Compute stacks separately, see: Upgrading), finding the causal error for failed CloudFormation templates can require a bit of spelunking.
You may need to switch the view in the CloudFormation dashboard to see “failed” or “deleted” stacks.

Then select the failed stack(s) and look in the ‘Events’ tab to determine the reason they failed.
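The same spelunking can be scripted. The following is only a rough sketch, assuming Python with boto3 installed and AWS credentials configured. The helper names `root_cause_events` and `failed_stack_events` are made up for illustration, and the filtering rule (skip "Resource creation cancelled" rows and rows that only report a failed child) is a heuristic, not anything Datomic-specific.

```python
from typing import Dict, Iterable, Iterator, List, Tuple

# Stack statuses that the console hides behind its "failed" / "deleted" views.
FAILED_OR_DELETED = [
    "CREATE_FAILED",
    "ROLLBACK_IN_PROGRESS",
    "ROLLBACK_FAILED",
    "ROLLBACK_COMPLETE",
    "DELETE_FAILED",
    "DELETE_COMPLETE",
]


def root_cause_events(events: Iterable[Dict]) -> List[Dict]:
    """Keep only failed events that carry a reason of their own.

    Heuristic: rows saying "Resource creation cancelled" are collateral
    damage, and rows saying "The following resource(s) failed ..." only
    point at a child resource or nested stack that failed first.
    """
    causes = []
    for event in events:
        if not event.get("ResourceStatus", "").endswith("_FAILED"):
            continue
        reason = event.get("ResourceStatusReason", "")
        if reason == "Resource creation cancelled":
            continue
        if "The following resource(s) failed" in reason:
            continue
        causes.append(event)
    return causes


def failed_stack_events(region: str) -> Iterator[Tuple[str, Dict]]:
    """Yield (stack name, event) pairs for failed or deleted stacks in a region."""
    import boto3  # imported lazily so the filter above works without AWS access

    cfn = boto3.client("cloudformation", region_name=region)
    stack_pages = cfn.get_paginator("list_stacks").paginate(
        StackStatusFilter=FAILED_OR_DELETED
    )
    for stack_page in stack_pages:
        for stack in stack_page["StackSummaries"]:
            # Deleted stacks have to be looked up by stack ID rather than name.
            event_pages = cfn.get_paginator("describe_stack_events").paginate(
                StackName=stack["StackId"]
            )
            events = [e for page in event_pages for e in page["StackEvents"]]
            for event in root_cause_events(events):
                yield stack["StackName"], event
```

Calling `failed_stack_events("us-east-1")` and printing each event's `LogicalResourceId` and `ResourceStatusReason` should surface roughly what the ‘Events’ tab shows, including events from the nested Storage and Compute stacks.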

Often, if you’re re-using previous names, they will have failed because they tried to create a resource that already exists. You can find details on deleting your stack cleanly here:

https://docs.datomic.com/cloud/operation/deleting.html#deleting-storage

You may also need to search for resources that can’t be automatically cleaned up by using tag search:

https://docs.datomic.com/cloud/operation/monitoring.html#tags
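A tag search can also be run from a script. This is a hedged sketch, assuming Python with boto3 and the AWS Resource Groups Tagging API. The tag key `datomic:system` is an assumption on my part (check the tags page linked above for the keys actually in use), and the function names are hypothetical.

```python
from collections import defaultdict
from typing import Dict, Iterable, List


def group_arns_by_service(arns: Iterable[str]) -> Dict[str, List[str]]:
    """Group ARNs by service (arn:partition:service:region:account:resource)."""
    grouped = defaultdict(list)
    for arn in arns:
        parts = arn.split(":", 5)
        is_arn = len(parts) == 6 and parts[0] == "arn"
        grouped[parts[2] if is_arn else "unknown"].append(arn)
    return dict(grouped)


def leftover_resources(
    system_name: str,
    region: str,
    tag_key: str = "datomic:system",  # assumed tag key, verify against the docs
) -> Dict[str, List[str]]:
    """List resources in a region that still carry the system's tag."""
    import boto3  # imported lazily so the grouping helper works without AWS access

    client = boto3.client("resourcegroupstaggingapi", region_name=region)
    pages = client.get_paginator("get_resources").paginate(
        TagFilters=[{"Key": tag_key, "Values": [system_name]}]
    )
    arns = []
    for page in pages:
        arns.extend(m["ResourceARN"] for m in page["ResourceTagMappingList"])
    return group_arns_by_service(arns)
```

An empty result from `leftover_resources("my-system", "us-east-1")` suggests nothing tagged with that system name remains in the region. Untagged resources, or services the tagging API does not cover, would not show up here.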


#6

I am also having similar issues. It seems to be related to the AWS region: for US East (N. Virginia) it keeps failing for both Solo and Production.

For US East (Ohio) it worked.

US East (N. Virginia)

> |Time|Status|Type|Logical ID|Status reason|
> |---|---|---|---|---|
> |10:22:46 UTC+0200|DELETE_IN_PROGRESS|AWS::CloudFormation::Stack|StorageF7F305E7||
> |10:22:23 UTC+0200|ROLLBACK_IN_PROGRESS|AWS::CloudFormation::Stack|prod-test|The following resource(s) failed to create: [StorageF7F305E7]. . Rollback requested by user.|
> |10:22:22 UTC+0200|CREATE_FAILED|AWS::CloudFormation::Stack|StorageF7F305E7|Embedded stack arn:aws:cloudformation:us-east-1:772499141725:stack/prod-test-StorageF7F305E7-1P1VLCRJG0VQ3/10ab08f0-0367-11e9-aa40-0a0b50a105f6 was not successfully created: The following resource(s) failed to create: [DhcpOptions, EnsureEc2Vpc].|
> |10:21:21 UTC+0200|CREATE_IN_PROGRESS|AWS::CloudFormation::Stack|StorageF7F305E7|Resource creation Initiated|
> |10:21:20 UTC+0200|CREATE_IN_PROGRESS|AWS::CloudFormation::Stack|StorageF7F305E7||
> |10:21:16 UTC+0200|CREATE_IN_PROGRESS|AWS::CloudFormation::Stack|prod-test|User Initiated|

US East (N. Virginia), a different attempt:

|Time|Status|Type|Logical ID|Status reason|
|---|---|---|---|---|
|11:13:44 UTC+0200|CREATE_FAILED|AWS::CloudFormation::Stack|testprod-StorageF7F305E7-14BMOK8Z5NJ9E|The following resource(s) failed to create: [TagStackResourceLogGroup, DhcpOptions, EnsureEc2Vpc].|
|11:13:43 UTC+0200|CREATE_FAILED|AWS::Logs::LogGroup|TagStackResourceLogGroup|Resource creation cancelled|
|11:13:43 UTC+0200|CREATE_FAILED|AWS::EC2::DHCPOptions|DhcpOptions|Resource creation cancelled|
|11:13:42 UTC+0200|CREATE_IN_PROGRESS|AWS::Logs::LogGroup|TagStackResourceLogGroup|Resource creation Initiated|
|11:13:42 UTC+0200|CREATE_FAILED|Custom::ResourceCheck|EnsureEc2Vpc|Failed to create resource. See the details in CloudWatch Log Stream: 2018/12/19/[$LATEST]73bbd461975743e9b0d10c39c3462023|
|11:13:42 UTC+0200|CREATE_IN_PROGRESS|Custom::ResourceCheck|EnsureEc2Vpc|Resource creation Initiated|
|11:13:42 UTC+0200|CREATE_IN_PROGRESS|AWS::Logs::LogGroup|TagStackResourceLogGroup||
|11:13:38 UTC+0200|CREATE_IN_PROGRESS|Custom::ResourceCheck|EnsureEc2Vpc||
|11:13:38 UTC+0200|CREATE_COMPLETE|AWS::Lambda::Function|TagStackResource||

US West (Oregon):

|Time|Status|Type|Logical ID|Status reason|
|---|---|---|---|---|
|10:41:45 UTC+0200|DELETE_IN_PROGRESS|AWS::CloudFormation::Stack|Compute||
|10:41:38 UTC+0200|ROLLBACK_IN_PROGRESS|AWS::CloudFormation::Stack|prod-test|The following resource(s) failed to create: [Compute]. . Rollback requested by user.|
|10:41:38 UTC+0200|CREATE_FAILED|AWS::CloudFormation::Stack|Compute|Embedded stack arn:aws:cloudformation:us-west-2:772499141725:stack/prod-test-Compute-2IK23NQVF8M1/3fcd2850-0369-11e9-97db-0a44a01d32f4 was not successfully created: The following resource(s) failed to create: [TxAutoScalingGroup].|
|10:36:59 UTC+0200|CREATE_IN_PROGRESS|AWS::CloudFormation::Stack|Compute|Resource creation Initiated|
|10:36:58 UTC+0200|CREATE_IN_PROGRESS|AWS::CloudFormation::Stack|Compute||
|10:36:55 UTC+0200|CREATE_COMPLETE|AWS::CloudFormation::Stack|StorageF7F305E7||
|10:30:57 UTC+0200|CREATE_IN_PROGRESS|AWS::CloudFormation::Stack|StorageF7F305E7|Resource creation Initiated|
|10:30:56 UTC+0200|CREATE_IN_PROGRESS|AWS::CloudFormation::Stack|StorageF7F305E7||
|10:30:52 UTC+0200|CREATE_IN_PROGRESS|AWS::CloudFormation::Stack|prod-test|User Initiated|

#7

Did you ensure that your account is VPC only in the regions where it did not work (https://docs.datomic.com/cloud/setting-up.html#aws-account)?

Also, is it possible you’ve previously created a system with the same name in the regions where it failed? If the resources created by that system were not fully removed you may be hitting a naming conflict in trying to re-create them.
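For the VPC-only question, one way to check a region from a script is the EC2 `supported-platforms` account attribute. This is a minimal sketch, assuming Python with boto3 and configured credentials. The helper names are hypothetical, and the rule "VPC only means the attribute lists `VPC` and nothing else" is my reading of the EC2 API, not something taken from the Datomic docs.

```python
from typing import Dict, List


def is_vpc_only(account_attributes: List[Dict]) -> bool:
    """True when supported-platforms lists VPC and nothing else.

    Accounts that still support EC2-Classic in a region report both
    "EC2" and "VPC" for this attribute.
    """
    for attr in account_attributes:
        if attr.get("AttributeName") == "supported-platforms":
            values = {v["AttributeValue"] for v in attr.get("AttributeValues", [])}
            return values == {"VPC"}
    return False  # attribute missing: treat as unknown, not as VPC only


def region_is_vpc_only(region: str) -> bool:
    """Query EC2 for the account's supported platforms in one region."""
    import boto3  # imported lazily so is_vpc_only can be used without AWS access

    ec2 = boto3.client("ec2", region_name=region)
    response = ec2.describe_account_attributes(
        AttributeNames=["supported-platforms"]
    )
    return is_vpc_only(response["AccountAttributes"])
```

Running `region_is_vpc_only` for each region you plan to use, for example `"us-east-1"` and `"us-east-2"`, would show whether the regions differ in this respect.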


#8

@marshall VPC only was the problem.

Totally missed that bit in the setting up info.

Thank you


#9

@marshall Datomic Cloud does seem to have some strange AWS-related “race conditions” of some kind.

I used a newer AWS account with the Ireland region (I also got this for a US region).

The CloudFormation stack creation failed for both Production and Solo the first time.

I attached an image of the Production failure. I tried again with the same app name after rollback and deleting the stack, and a bunch of resources failed: DatomicCMK, CatalogTable, FileSystem, etc.

Then I deleted the stack and tried Production again with the same region, config, etc., but with a different app name. This time everything worked.