Sam’s Notebook: Mox – Password-less SSH!

0000-0002-2747-368X

The high performance computing (HPC) cluster (called Mox) at Univ. of Washington (UW) frustratingly requires a password when SSH-ing, even when SSH keys are in use. I have a lengthy, unintelligible password that I use for my UW account, so having to type this in any time I want to initiate a new SSH session on Mox is a painful process.

Today, I finally got fed up with how much time I was wasting (granted, it’s minor in the grand scheme of my day) just logging in to Mox, so I spent some time figuring out how to automate password entry for a new SSH session with Mox.

I tried to handle this using the program sshpass, but I couldn’t get it to read my password from a file – it would just hang in limbo after executing the command.

In the end, I came across an expect script that does this perfectly. Steps to implement this on Ubuntu 16.04 LTS:

  1. Install expect:
    sudo apt install expect
  2. Create the following script, saved as mox.sh (taken from this StackExchange solution: https://ift.tt/2LaGayC):
     #!/usr/bin/expect
     spawn ssh mox
     expect "Password:"
     send "<password>\r"
     interact

    NOTES:

    • I have an ~/.ssh/config file that allows me to use “mox” as an alias for my full SSH command
    • Replace <password> with your own UW password.
  3. Restrict access to the script (read, write, and execute for the user only):
    chmod u=rwx,go-rwx mox.sh
  4. Run the script from your home directory (where it was saved):
    ./mox.sh
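
Since the script holds the password in plain text, it may be worth creating the file with owner-only permissions from the very start, rather than tightening them afterwards. Here is a minimal sketch of that idea, assuming GNU coreutils (as on Ubuntu). The temporary directory is just a throwaway location for the demo and the password is a placeholder:

```shell
#!/bin/sh
# Sketch: create the expect script with owner-only permissions and confirm the mode.
# demo_dir is a throwaway location for illustration; in practice this would be $HOME.
demo_dir=$(mktemp -d)
script="$demo_dir/mox.sh"

# With umask 077, the new file has no group/other permissions at creation time.
(
  umask 077
  cat > "$script" <<'EOF'
#!/usr/bin/expect
spawn ssh mox
expect "Password:"
send "<password>\r"
interact
EOF
)

# Same permission change as in step 3 above.
chmod u=rwx,go-rwx "$script"

# Print the octal mode (GNU stat syntax).
stat -c '%a' "$script"
```

A mode of 700 means only the owner can read, write, or execute the file.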

Boom! No more having to track down the password, copy, and paste!

from Sam’s Notebook https://ift.tt/2L49Tw0
via IFTTT

Sam’s Notebook: Ubuntu – Fix “No Video Signal” Issue on Emu/Roadrunner

0000-0002-2747-368X

Both Apple Xserves (Emu/Roadrunner) running Ubuntu (16.04 LTS) experienced the same issue – the monitor would indicate “No Video Signal”, go dark, and not respond to keyboard/mouse input. However, you could SSH into both machines w/o issue.

Although having these machines be “headless” (i.e. with no display) is usually fine, it’s not ideal for a couple of reasons:

  1. Difficult to use for other lab members who aren’t as familiar with SSH – specifically if they would want to use a Jupyter Notebook remotely (this would require setting up a tunnel to their own computer).
  2. Can’t use Remmina Remote Desktop until a user has physically logged in from the Ubuntu login screen at least once, in order to launch Remmina.

The second reason was the major impetus for me finally dealing with this. Managing long-running Jupyter Notebooks is much easier via remote desktop than via an SSH tunnel, because the tunnel limits my access to the Jupyter Notebook to the one computer that has the tunnel set up.
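
For reference, the SSH tunnel approach usually looks something like the following. This is only a sketch: the user name, host name, and port 8888 (the usual Jupyter default) are placeholders, not our actual settings.

```shell
#!/bin/sh
# Sketch: build the ssh port-forwarding command for a remote Jupyter Notebook.
# All three values below are hypothetical placeholders.
remote_user="username"
remote_host="emu.example.edu"
port=8888

# -N: do not run a remote command; -L: forward a local port to the remote port.
tunnel_cmd="ssh -N -L ${port}:localhost:${port} ${remote_user}@${remote_host}"

# Print the command instead of running it, so nothing connects until you are ready.
echo "$tunnel_cmd"
```

Once the tunnel is running, the notebook is reachable at localhost on the forwarded port, but only from the computer that opened the tunnel.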

Well, this led me down a horrible rabbit hole of Linux stuff that I won’t get fully into (particularly since I didn’t understand most of it and can’t remember all the crazy stuff I read/tried).

However, here’s the gist:

  1. Needed to edit /etc/default/grub
  2. After editing, needed to update grub config file: sudo update-grub

Despite the fact that both machines are (or, should be) identical, I did not get the same results. The edits I made to the /etc/default/grub file on Emu worked immediately. The edits were:

  1. Add nomodeset to this line (shown here as edited; this seemed to be the most common suggestion for fixing the “No Video Signal” issue):

GRUB_CMDLINE_LINUX_DEFAULT="quiet splash nomodeset"

  2. Comment out this line (this line was triggering an error/warning about writing the config file when running the update-grub command):

#GRUB_HIDDEN_TIMEOUT=0
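
I made these edits by hand, but the same two changes could be scripted. Below is a rough sketch that works on a copy of a grub defaults file, assuming GNU sed (as on Ubuntu). The sample file contents are a stand-in I made up for the demo, not the real /etc/default/grub from either machine.

```shell
#!/bin/sh
# Sketch: apply the two edits to a COPY of a grub defaults file.
# The sample content below is a stand-in, not the real /etc/default/grub.
work=$(mktemp -d)
grub_file="$work/grub"
cat > "$grub_file" <<'EOF'
GRUB_DEFAULT=0
GRUB_HIDDEN_TIMEOUT=0
GRUB_CMDLINE_LINUX_DEFAULT="quiet splash"
EOF

# Keep a backup before editing.
cp "$grub_file" "$grub_file.bak"

# 1. Append nomodeset inside the quotes, unless it is already present.
if ! grep -q '^GRUB_CMDLINE_LINUX_DEFAULT=.*nomodeset' "$grub_file"; then
  sed -i 's/^\(GRUB_CMDLINE_LINUX_DEFAULT=".*\)"$/\1 nomodeset"/' "$grub_file"
fi

# 2. Comment out the GRUB_HIDDEN_TIMEOUT line.
sed -i 's/^GRUB_HIDDEN_TIMEOUT=/#GRUB_HIDDEN_TIMEOUT=/' "$grub_file"

cat "$grub_file"
```

On a real machine the sed commands would need sudo and the real file path, followed by sudo update-grub.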

For some reason, Roadrunner did not take kindly to those changes and it took a long time to resolve, ending with changing permissions on ~/.Xauthority back to their original permissions (they got altered when I ran some command – sudo startx or something) to get out of a login loop.

Regardless, both are fixed, both can be used when physically sitting at the computer, and both can be accessed remotely using Remmina!

from Sam’s Notebook https://ift.tt/2KTjtBT

-> Done reading data waiting for calculations to finish
	-> Done waiting for threads
	-> Output filenames:
		->"Genotypes_parentage.arg"
		->"Genotypes_parentage.mafs.gz"
		->"Genotypes_parentage.geno.gz"
	-> Fri Jul  6 23:37:52 2018
	-> Arguments and parameters for all analysis are located in .arg file
	-> Total number of sites analyzed: 786914862
	-> Number of sites retained after filetering: 696 
	[ALL done] cpu-time used =  240010.84 sec
	[ALL done] walltime used =  128473.00 sec
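
To put those numbers in perspective, here is a quick back-of-the-envelope calculation. It uses only the values printed in the log above; the variable names are just mine.

```shell
#!/bin/sh
# Values copied from the ANGSD log above.
sites_total=786914862
sites_kept=696
cpu_sec=240010.84
wall_sec=128473.00

# Walltime in hours, and the cpu-time to walltime ratio.
wall_hours=$(awk -v w="$wall_sec" 'BEGIN { printf "%.1f", w / 3600 }')
cpu_ratio=$(awk -v c="$cpu_sec" -v w="$wall_sec" 'BEGIN { printf "%.2f", c / w }')

# Sites retained per million sites analyzed.
kept_per_million=$(awk -v k="$sites_kept" -v t="$sites_total" 'BEGIN { printf "%.2f", k / t * 1e6 }')

echo "walltime: ${wall_hours} h"
echo "cpu/wall ratio: ${cpu_ratio}"
echo "sites kept per million analyzed: ${kept_per_million}"
```

So the run took roughly a day and a half of walltime, and fewer than one site per million analyzed passed the filters.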

Grace’s Notebook: Notes from Crab Meeting

Today we had our 4th crab meeting and discussed our short-term sequencing plan for 3 libraries (1: day 9 uninfected; 2: day 9 infected; and 3: a “masterpool” from the remaining 10 treatments). We also discussed our plan going forward for qPCR and for creating more libraries later. Hopefully we will see some cool things with the three libraries chosen so far and with the qPCR.

I have the meeting recorded, just have to edit and publish it as DecaPod S1E10.

We are going to go ahead with the three libraries proposed in my previous notebook post. Sam is going to check around at the UW CORE facilities available to us. We have to use a UW facility due to budget restrictions (Pam would have to re-negotiate if we wanted to use something else, like Genewiz).

I (with Sam’s assistance and insight) will create the pooled samples such that we will have a tube for each library (3). We will then use the Speed Vac to evaporate off some liquid in order to get a specific concentration (TBD- depends on what the CORE facility requires).

Sequencing takes some time, so while we are waiting for the results to come back, I will compile databases of genomic resources. Namely, I will find FASTA files for the species most closely related to Chionoecetes bairdi and Hematodinium spp. and create databases. I will also practice using Trinity and BLAST with some geoduck data that we already have so that once the RNAseq data from our crabs comes back, we’ll already have a good idea of how to execute the bioinformatic pipeline.

Once we analyze the data and pick out some genes, we will make primers and use qPCR on individuals. If we see anything that we’d like to look at more closely, we can create new libraries of individuals or pools. I also may extract more RNA, since I have currently only extracted RNA from <50% of the surviving crabs (113 crabs survived the experiment and I have extracted RNA from 51 of them).

I will be learning a lot this summer and I am really excited! Reading and practicing occasionally doesn’t stick with me as much as actually doing things, so practicing with real data will be very helpful. Pam is also interested in learning more, so working with her and potentially teaching her what I learn will further enrich my understanding. Looking forward to it!

from Grace’s Lab Notebook https://ift.tt/2u7HsTz

#!/bin/bash
## Job Name
#SBATCH --job-name=angsd-05
## Allocation Definition
#SBATCH --account=srlab
#SBATCH --partition=srlab
## Resources
## Nodes (We only get 1, so this is fixed)
#SBATCH --nodes=1
## Walltime (days-hours:minutes:seconds format)
#SBATCH --time=10-100:00:00
## Memory per node
#SBATCH --mem=100G
#SBATCH --mail-type=ALL
#SBATCH --mail-user=sr320@uw.edu
## Specify the working directory for this job
#SBATCH --workdir=/gscratch/srlab/sr320/analyses/0705





source /gscratch/srlab/programs/scripts/paths.sh


/gscratch/srlab/sr320/programs/angsd/angsd \
-bam /gscratch/srlab/sr320/data/cw/all_bam.bamlist \
-out Genotypes_parentage \
-GL 1 \
-doMaf 1 \
-doMajorMinor 1 \
-minMaf 0.25 \
-SNP_pval 1e-6 \
-minInd 525 \
-minQ 20 \
-P 28 \
-doGeno 2 \
-doPost 1 \
-postCutoff 0.95 \
-doCounts 1 \
-geno_minDepth 7

#sbatch
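
One thing worth checking before submitting a job like this: -minInd only keeps sites with data from at least that many individuals, so it should never be larger than the number of BAM files in the list. Here is a small sketch of that sanity check. The bamlist is a made-up three-line example and the threshold of 2 is hypothetical; neither comes from the real all_bam.bamlist.

```shell
#!/bin/sh
# Sketch: check that -minInd does not exceed the number of BAM files listed.
# The bamlist below is a made-up three-line example, not the real all_bam.bamlist.
work=$(mktemp -d)
bamlist="$work/example.bamlist"
printf '%s\n' /path/to/a.bam /path/to/b.bam /path/to/c.bam > "$bamlist"

min_ind=2   # hypothetical threshold for this toy list

# Count non-empty lines (one BAM path per line).
n_bam=$(grep -c . "$bamlist")

if [ "$min_ind" -gt "$n_bam" ]; then
  echo "minInd ($min_ind) exceeds number of BAMs ($n_bam)" >&2
  status="too_high"
else
  status="ok"
fi
echo "BAM files: $n_bam, minInd: $min_ind, status: $status"
```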

Sam’s Notebook: Transposable Element Mapping – Olympia Oyster Genome Assembly, Olurida_v081, using RepeatMasker 4.07

0000-0002-2747-368X

I previously performed this analysis using a different version of our Ostrea lurida genome assembly. Steven asked that I repeat the analysis with a modified version of the genome assembly (Olurida_v081), which only has contigs >1000bp in length.

Genome used: Olurida_v081

I ran RepeatMasker (v4.07) with RepBase-20170127 and RMBlast 2.6.0 four times:

  1. Default settings (i.e. no species selected – RepeatMasker defaults to human).
  2. Species = Crassostrea gigas (Pacific oyster)
  3. Species = Crassostrea virginica (Eastern oyster)
  4. Species = Ostrea lurida (Olympia oyster)

The idea was to get a sense of how the analyses would differ with species specifications. However, it’s likely that the only species setting that will make any difference will be Run #2 (Crassostrea gigas).

The reason I say this is that RepeatMasker has a built-in tool to query which species are available in the RepBase database. For example:

 RepeatMasker-4.0.7/util/queryRepeatDatabase.pl -species "crassostrea virginica" -stat 

Here’s a very brief overview of what that yields:

  • Crassostrea gigas: 792 Crassostrea gigas specific repeats
  • Crassostrea virginica: 4 Crassostrea virginica specific repeats
  • Ostrea lurida: 0 Ostrea lurida specific repeats

All runs were performed on roadrunner.

All commands were documented in a Jupyter Notebook (GitHub):

NOTE: RepeatMasker writes the desired output files (*.out, *.cat.gz, and *.gff) to the same directory that the genome is located in! If you conduct multiple runs with the same genome in the same directory, it will overwrite those files, as they are named using the genome assembly filename.
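
One way to avoid the overwriting problem is RepeatMasker's -dir option, which sends output to a directory of your choosing. The sketch below is a dry run that only prints the commands. The genome path and output root are hypothetical placeholders, not the paths I actually used.

```shell
#!/bin/sh
# Sketch: one output directory per RepeatMasker run, using -dir.
# The genome path and output root are hypothetical placeholders.
genome="/path/to/Olurida_v081.fa"
out_root=$(mktemp -d)

for species in "crassostrea gigas" "crassostrea virginica" "ostrea lurida"; do
  # Build a directory name from the species (spaces become underscores).
  out_dir="$out_root/$(printf '%s' "$species" | tr ' ' '_')"
  mkdir -p "$out_dir"
  # Dry run: print the command instead of executing it.
  echo "RepeatMasker -species \"$species\" -dir \"$out_dir\" \"$genome\""
done
```

The default run (no -species) would need its own directory as well.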

New photo at 47.81919166666667, -122.88106666666667

July 03, 2018 at 11:46AM
via iOS Photos https://ift.tt/2KEcKM3

Downloading a subset of files from the web

wget -r --no-parent -A 'CP*.gz' http://owl.fish.washington.edu/nightingales/O_lurida/

Grace’s Notebook: New Crab Pooling Plan for RNA Seq

Today I met with Sam and Steven a couple times to figure out what we’re going to discuss this Thursday with Pam. Steven and Sam have come up with three pools that we feel are a good place to start because unfortunately there’s not enough RNA to do the original pooling plan. This post details all the info that I currently have on the plan for our discussion on Thursday.

Links to spreadsheets:
RAW csv (mainsheet): 20180702-crab-sampling-file.csv
Excel spreadsheet (tabs for each pool): 20180702-crab-sampling-file.xls

New plan:
Start out with 3 pools:

  • Pool #1 Uninfected from Day 9
  • Pool #2 Infected from Day 9
  • MasterPool: The sample with the most RNA (ng) from each of the remaining 10 pools (Pools 3 – 12)

This plan will not be able to take temperature into account, but it will allow for gene discovery between the infected and uninfected pools (Pools 1 and 2), as well as gene discovery in the MasterPool.
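
Picking the MasterPool samples by hand from the spreadsheet works, but the selection could also be scripted. Here is a rough sketch with awk. The CSV is invented toy data with hypothetical column names (pool, sample_id, rna_ng); it is not taken from 20180702-crab-sampling-file.csv.

```shell
#!/bin/sh
# Sketch: pick the highest-RNA sample from each pool.
# The CSV below is invented toy data with hypothetical column names;
# it is NOT taken from 20180702-crab-sampling-file.csv.
work=$(mktemp -d)
csv="$work/toy_samples.csv"
cat > "$csv" <<'EOF'
pool,sample_id,rna_ng
3,A,120
3,B,340
4,C,95
4,D,80
EOF

# For each pool, remember the row with the largest rna_ng (header row skipped),
# then sort the picks numerically by pool.
picks=$(awk -F, 'NR > 1 {
    if (!($1 in best) || $3 + 0 > best[$1]) { best[$1] = $3 + 0; id[$1] = $2 }
  }
  END { for (p in best) print p "," id[p] "," best[p] }' "$csv" | sort -t, -k1,1n)

echo "$picks"
```

With the toy data, this selects sample B for pool 3 and sample C for pool 4.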

After these pools are sequenced and some targets are ID’ed, we can go in and do qPCR and then plan for some more libraries based on those results.

Below are the details on how I am going to do the pools (very much subject to change as I get input from others):

Library #1 – Uninfected

img

Library #2 – Infected

img

MasterPool Library

img

New Pool Plan in the big spreadsheet:

Pink: Library 1
Blue: Library 2
Green: MasterPool
img

Next Steps

If this all looks good, then we can move on to using the Speed Vac and then send them to get sequenced.

from Grace’s Lab Notebook https://ift.tt/2NiEvZb