
Archive for May, 2012

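The read() method does not read across block boundaries, so a single call can return fewer bytes than you asked for. Use readFully() instead to read until the end of a file: with a buffer sized to the file, it keeps reading until the buffer is full, so all data blocks of a file across the cluster can be read with readFully().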
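To make the difference concrete, here is a minimal sketch using plain java.io.DataInputStream, which Hadoop's FSDataInputStream extends (assuming that is the stream class meant here). The chunked() helper and the class name are hypothetical; the helper only imitates a stream that hands back at most one "block" per read() call, so the sketch runs without a cluster.

```java
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.FilterInputStream;
import java.io.IOException;
import java.io.InputStream;

public class ReadFullyDemo {
    /** Hypothetical stand-in for a block-based stream: at most `chunk` bytes per read() call. */
    static InputStream chunked(byte[] data, final int chunk) {
        return new FilterInputStream(new ByteArrayInputStream(data)) {
            @Override
            public int read(byte[] b, int off, int len) throws IOException {
                return super.read(b, off, Math.min(len, chunk));
            }
        };
    }

    /** One read() call: returns how many bytes actually arrived, which may be less than requested. */
    static int singleRead(byte[] data, int chunk) throws IOException {
        try (DataInputStream in = new DataInputStream(chunked(data, chunk))) {
            return in.read(new byte[data.length]);
        }
    }

    /** readFully() keeps calling read() until the whole buffer is filled (or throws EOFException). */
    static byte[] fullRead(byte[] data, int chunk) throws IOException {
        byte[] buf = new byte[data.length];
        try (DataInputStream in = new DataInputStream(chunked(data, chunk))) {
            in.readFully(buf);
        }
        return buf;
    }

    public static void main(String[] args) throws IOException {
        byte[] data = {1, 2, 3, 4, 5, 6, 7, 8, 9, 10};
        System.out.println("read() returned " + singleRead(data, 4) + " bytes");
        System.out.println("readFully() filled " + fullRead(data, 4).length + " bytes");
    }
}
```

With a real HDFS file the same pattern should apply: size the buffer from the file length and call readFully() rather than relying on a single read().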


String bam_file = "/path/file.bam";
String bcf_file = "/path/file.bcf";
String filtered_vcf_file = "/path/file.flt.vcf";

// "/bin/sh", "-c" runs the command line through a shell, so pipes and redirects work the same way as on the Linux command line.
// The third array element is the command line string itself; Java variables can be concatenated into it.
String[] cmd = {"/bin/sh", "-c", "/usr/local/bin/samtools mpileup -uf /data/hg19.fa " + bam_file + " | /usr/local/bin/bcftools view -bvcg - > " + bcf_file};
Process p = Runtime.getRuntime().exec(cmd);
p.waitFor();

// Second step: convert the BCF to text and filter the variants; vcfutils.pl writes VCF.
String[] cmd2 = {"/bin/sh", "-c", "/usr/local/bin/bcftools view " + bcf_file + " | /usr/local/bin/vcfutils.pl varFilter -D100 > " + filtered_vcf_file};
Process p2 = Runtime.getRuntime().exec(cmd2);
p2.waitFor();
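The snippet above throws away the exit status of each step, so a failed samtools run would go unnoticed before the second command starts. A minimal sketch of a helper that runs a command line through /bin/sh -c and returns both the exit code and the captured output could look like this. The ShellPipeline class, the Result type, and the run() method are hypothetical names for illustration, and the echo/tr command in main() only stands in for the real pipeline.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;

public class ShellPipeline {
    /** Exit status and combined stdout/stderr of one shell command line. */
    static final class Result {
        final int exitCode;
        final String output;

        Result(int exitCode, String output) {
            this.exitCode = exitCode;
            this.output = output;
        }
    }

    /** Runs commandLine via "/bin/sh -c" so pipes and redirects are interpreted by the shell. */
    static Result run(String commandLine) throws IOException, InterruptedException {
        ProcessBuilder pb = new ProcessBuilder("/bin/sh", "-c", commandLine);
        pb.redirectErrorStream(true); // merge stderr into stdout so there is one stream to drain
        Process p = pb.start();

        // Drain the output before waitFor(); a full pipe buffer would otherwise block the child.
        ByteArrayOutputStream captured = new ByteArrayOutputStream();
        try (InputStream in = p.getInputStream()) {
            byte[] buf = new byte[8192];
            int n;
            while ((n = in.read(buf)) != -1) {
                captured.write(buf, 0, n);
            }
        }
        int exitCode = p.waitFor();
        return new Result(exitCode, new String(captured.toByteArray(), StandardCharsets.UTF_8));
    }

    public static void main(String[] args) throws Exception {
        // Placeholder pipeline; substitute the samtools/bcftools command line here.
        Result r = run("echo hello | tr a-z A-Z");
        System.out.println(r.exitCode + " " + r.output.trim());
    }
}
```

Checking for a non-zero exitCode after the first step is then a one-line guard before starting the second. Keep in mind that for a pipeline the shell reports the status of the last command only, so a failure earlier in the pipe may still need to be spotted in the captured output.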


Problem statement:
I want to create an EBS-backed AMI from a running instance-store instance (no EBS-backed version is available for the specific AMI I want to use) and customize it by installing software and loading some reference genomes, so that a cluster launched from it comes up with all tools and data ready. I also want to increase the root drive size so that the software and data fit.

Here are the step-by-step instructions:
1. launch the instance-store AMI I want to use

2. create an EBS volume and attach it to the above instance from the AWS console

3. log in to the instance and follow the instructions below

find the running services and manually stop them (except sshd)
$ service --status-all | grep run


copy root drive to the EBS drive you attached
$ dd bs=65536 if=/dev/sda1 of=/dev/sdf

mount the EBS drive
$ mkdir /root/ebs-vol
$ mount /dev/sdf /root/ebs-vol

resize the file system of the attached EBS drive
$ resize2fs /dev/sdf

4. load the data you will need onto the EBS drive so that the AMI you create includes it

5. create a snapshot of the EBS drive; the snapshot ID is used to register the AMI below

6. register the snapshot as an AMI, which will then be ready to use (keep the above instance running during this step).

look up the image information for the instance's AMI, which is needed to register the new AMI
$ ec2-describe-images ami-975390fe
IMAGE ami-975390fe 490429964467/hadoop-0.20.203.0-x86_64 490429964467 available public x86_64 machine aki-b51cf9dc ari-b31cf9da instance-store paravirtual xen

register the AMI (-b "/dev/sdb=ephemeral0" maps the ephemeral store, which is mounted at /mnt; -b "/dev/sda=snap-41ebf73d::false" maps the root device to the snapshot)
$ ec2-register -a x86_64 -b "/dev/sdb=ephemeral0" -b "/dev/sda=snap-41ebf73d::false" -d "cai custom hadoop" -n "hadoop-0.20.203.0-x86_64.manifest.xml" --kernel aki-b51cf9dc --ramdisk ari-b31cf9da -s snap-41ebf73d
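The kernel and ramdisk IDs passed to ec2-register come straight from the IMAGE line printed by ec2-describe-images. If you want to script this step, a minimal sketch in Java could pick those fields out and assemble the argument list. The RegisterArgs class and its methods are hypothetical helpers, the options are just the ones used in the command above (minus the description), and the sketch assumes the aki-/ari- fields appear in the IMAGE line as they do in the sample output.

```java
import java.util.Arrays;
import java.util.List;

public class RegisterArgs {
    /** Returns the first whitespace-separated field of the IMAGE line that starts with prefix. */
    static String firstWithPrefix(String imageLine, String prefix) {
        for (String field : imageLine.trim().split("\\s+")) {
            if (field.startsWith(prefix)) {
                return field;
            }
        }
        throw new IllegalArgumentException("no field starting with " + prefix);
    }

    /** Builds the ec2-register argument list, reusing the kernel and ramdisk of the source AMI. */
    static List<String> build(String imageLine, String snapshotId, String name) {
        String kernel = firstWithPrefix(imageLine, "aki-");
        String ramdisk = firstWithPrefix(imageLine, "ari-");
        return Arrays.asList(
                "ec2-register", "-a", "x86_64",
                "-b", "/dev/sdb=ephemeral0",
                "-b", "/dev/sda=" + snapshotId + "::false",
                "-n", name,
                "--kernel", kernel,
                "--ramdisk", ramdisk,
                "-s", snapshotId);
    }

    public static void main(String[] args) {
        String imageLine = "IMAGE ami-975390fe 490429964467/hadoop-0.20.203.0-x86_64 490429964467 "
                + "available public x86_64 machine aki-b51cf9dc ari-b31cf9da instance-store paravirtual xen";
        System.out.println(String.join(" ",
                build(imageLine, "snap-41ebf73d", "hadoop-0.20.203.0-x86_64.manifest.xml")));
    }
}
```

The resulting list can be handed to ProcessBuilder directly, which avoids quoting problems because each argument stays a separate array element.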

All set. The AMI will be ready to use.
