Wanted to do a quick walkthrough of starting a new AWS EC2 instance and uploading data. A little more context: I’ve been looking at exporting the Pecan analysis to something with a whole lot more RAM, and it looks like AWS might be a decent idea.
First, sign up for AWS. I used https://aws.amazon.com/education/awseducate/
Then log in and answer all of the questions. Pretty standard stuff: who you are, where you're from, why they should let you use their computers, etc.
Navigate to the EC2 Dashboard. It should look about like this. The link to the Dashboard itself is in the upper left corner:
We want to start a new instance, so click the Launch New Instance button.
We want to use a Starcluster AMI, which is not part of the “default” set of images, so we need to go to the Community AMIs option, found on the left side in the middleish of the window. Search for Starcluster there.
I chose the option "CosmoBox-StarCluster-2015-5-28". It uses an older version of Ubuntu, 14.04 vs. the 16.04 we use on Emu and Roadrunner, but hopefully this won't be an issue.
Next we get to choose our instance type. This is where money starts becoming a consideration. Ideally, we'd be playing with a memory-optimized r4.8xlarge with 32 vCPUs and 244 GB of RAM, but Amazon wants money for that, so we're building a toy instance on a t2.micro with 1 vCPU and 1 GB of memory.
Next we can configure the instance. This window lets us do two main things: launch multiple cloned instances (if we want to run multiple Pecan sample sets concurrently) and set up Spot price bidding. My understanding is that AWS has two pricing models, On-Demand and Spot. On-Demand is just like it sounds: you get the instance when you want it, but at a higher cost. Spot lets you bid a specific maximum, say $0.40/CPU-hour, and your instance launches only when the floating Spot price drops to or below your bid. Since we're playing on the free tier and only need one instance, we just click Next here.
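For reference, a Spot request can also be made from the command line. This is just a sketch: the AMI ID, key name, and price below are placeholders, and it assumes the AWS CLI is installed and configured.

```shell
# Launch specification for the Spot request.
# ImageId, InstanceType, and KeyName are placeholders -- substitute your own.
cat > launch-spec.json <<'EOF'
{
  "ImageId": "ami-12345678",
  "InstanceType": "r4.8xlarge",
  "KeyName": "my-key-pair"
}
EOF

# Submit a one-time Spot request capped at $0.40/hour.
# Guarded so the sketch doesn't fail on machines without the AWS CLI.
if command -v aws >/dev/null; then
  aws ec2 request-spot-instances \
    --spot-price "0.40" \
    --instance-count 1 \
    --type one-time \
    --launch-specification file://launch-spec.json
else
  echo "aws CLI not installed; command shown for reference only"
fi
```

The request stays pending until the Spot price allows it, which is why On-Demand is the simpler choice when you need the instance right away.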
Next we can add storage. The free tier of EC2 defaults to 8 GB of space, but is allowed up to 30 GB. I'm greedy, so I pick 30.
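Once the instance is up and you've logged in, it's worth confirming you actually got the volume size you asked for:

```shell
# Check the root filesystem's size and free space.
# On a 30 GB free-tier instance, "Size" should read roughly 30G.
df -h /
```

If the number doesn't match, the volume size can be adjusted from the EC2 console's Volumes page.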
After hitting next, we have the option to add tags to our instance. This is, as far as I can tell, for organizational purposes only.
Finally, it wants us to set firewall settings for access to the instance. Since this is a toy instance, I’m not playing with this.
After hitting next we’re invited to review the settings we chose, and launch the instance.
We're asked to create a public/private key pair after hitting Launch. This is important, as the key acts as the password for accessing your instance via SSH. The private key is delivered as a downloadable file; don't lose it. Losing that file is akin to losing the key to your house, only there are no digital locksmiths.
After downloading the key, hit launch. First you’ll see a launch status page, click the view instances button on the bottom right of the page to go to the main instances page. You’ll see the instance state is “Pending”, which means EC2 is starting your instance. After a minute or so, this will change to running.
Hooray. Your instance is live.
Access is similar to SSHing in to any Roberts Lab computer, except you have to provide that private key. Select your instance and then click Connect to see exactly what you need to do to connect.
If you fail to do the chmod 400 step on your private key, you’ll see this error.
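The permissions fix looks like this (the key filename is a placeholder, and the `touch` just stands in for the file you actually downloaded from AWS):

```shell
# Stand-in for the private key file downloaded from AWS.
touch my-key-pair.pem

# ssh refuses a private key that anyone else can read,
# so restrict it to owner-read-only.
chmod 400 my-key-pair.pem

# Then connect as "ubuntu", passing the key with -i.
# The hostname below is a placeholder -- use your instance's public DNS:
#   ssh -i my-key-pair.pem ubuntu@ec2-XX-XX-XX-XX.compute-1.amazonaws.com
```

Without the `chmod`, ssh reports that the key's permissions are too open and ignores it.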
After changing the permissions I encountered an odd thing on AWS's end. I copied the ssh command from the example on the Connect page, and the instance was unhappy with that, wanting me to log in as user "ubuntu" rather than "root". This is safer anyway, as you can really mess things up always running as root.
After changing that, I try and log back in and am greeted with a nice splash screen, showing some information about the instance, what comes pre-loaded, etc. From this point on it operates just like any ssh instance.
Coming next: copying files, installing Pecan, and testing Open Grid Engine (OGE) vs. Sun Grid Engine (SGE).
Unrelated to EC2 stuff, I’ve been working on methpipe stuff for Hollie and am running out of space on my laptop so I decided to move a bunch of intermediate files over to Owl for storage. It makes me miss having ethernet access to the UW network for transferring huge files. This is slow.