Friday, September 5, 2014

Consulting

Consulting career development.
http://kaiwenyang.com/index.html

Sunday, March 9, 2014

File format conversion, ^M linebreak to 'normal' linebreak in a file opened in vim


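This is the command in vim:
:%s/^V^M/^V^M/g
where ^V^M means type Ctrl+V, then Ctrl+M.
(The first ^M matches the carriage-return character; in the replacement, ^M breaks the line.)


:%s/A/B/  replaces A with B on every line
 g    (global) replaces every match on a line, not only the first

Before running a BLAST search, we need to reformat files containing accession numbers copied from Excel or Word documents.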
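
The same conversion can be done outside vim. Here is a minimal shell sketch, assuming the file uses a bare carriage return (the ^M character) as its only line break. The function name cr_to_lf and the file names in the usage line are made up for illustration.

```shell
# Convert bare CR (^M) line breaks to LF ("normal" Unix line breaks).
# Assumption: the input uses CR alone as its line break, not CR followed by LF.
cr_to_lf() {
    input_file=$1
    output_file=$2
    tr '\r' '\n' < "$input_file" > "$output_file"
}

# Usage (hypothetical file names):
# cr_to_lf accessions_from_excel.txt accessions_unix.txt
```

If the file instead has Windows-style line breaks (CR followed by LF), deleting the CR with tr -d '\r' would probably be the closer equivalent.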

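Friday, October 26, 2012

The difference between "such as" and "like"

Surprisingly many people may not know this. When writing papers, we often need to give examples. "like" and "such as" are both used frequently, but they differ clearly in meaning, and the wrong choice can cause a serious misunderstanding.

Let's look at the following sentences.

Example with LIKE
Can you buy me some fruit like apples or oranges?
(Meaning: could you buy me some fruit that is similar to apples or oranges?)

In this sentence, the speaker is saying "I do not want apples or oranges." Instead, the speaker wants "something similar to apples and oranges."

Example with SUCH AS
Can you buy me some fruit such as apples or oranges?
(Meaning: could you buy me apples or oranges, to name examples, or something similar to them?)


To sum up, when "like" is used, the items that follow "like" are not included. "such as" carries no such implication. When you actually want to eat apples or oranges, use "such as".

Source: the GMAT grammar page below
http://gmat-grammar.blogspot.com/2006/06/like-vs-such-as.html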

Sunday, October 21, 2012

Ocean Cleaning Day

There is a lot of plastic trash and debris in the ocean.  On Ocean Cleaning Day, we went to the local shoreline in Richmond to help pick up plastic waste.  The shore looks clean from above, but when you look carefully among the rocks there are many pieces of plastic that are breaking down. They are mostly bottle caps, plastic forks and styrofoam.  Plastic breaks down into small particles, and aquatic organisms ingest them accidentally.  My wife and I are trying to minimize plastic waste, but there is a lot of plastic around us and it is not easy to eliminate from our lives.  I wonder how people purchased food when plastics were not commonly used for packaging...






Thursday, October 11, 2012

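Mentoring students

Excerpt from President (via Yahoo):
"To begin with, Toyota has a deep-rooted idea: 'Raise subordinates who surpass you' (Eiji Toyoda, former president). The 'code of conduct' in the company creed, the Toyota Way 2001, also says: 'Create a culture in which your subordinates challenge you and improve the business processes you built.'"


This is a remarkable way of thinking. Today made me want to keep it in mind whenever I mentor students.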

Thursday, September 13, 2012

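Falling short

Lately I feel that I am falling short. It probably comes from my lack of English ability and experience.

1: A student I am currently supervising did not pass the selection exam. If I had a little more experience and knowledge, the result might have been different. I might have been able to give the right advice. Because I did not earn my degree here, I cannot help even when I want to.

Plan: In any case, listen carefully and try not to repeat the same situation. Also, think about how far I should go in looking after students.

2: A paper was rejected.

Plan: Most of the criticisms can be addressed. Don't get too discouraged; submit to the next journal. Also, it seems better to drop the attitude of "submit to a top journal first anyway." It wastes time and is not good for my state of mind. When choosing a journal, look at the quality and quantity of the papers it has accepted so far. Trust what I have seen for myself.

Shocking things happened one after another, and I recognize that I am a little worn down. I will do my best tomorrow!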

Tuesday, September 11, 2012

Standalone BLAST Search for Mac OS X


Quick instructions for setting up a standalone BLAST search
Mac OS X 10.6.8, Terminal (bash)

1. Go to the NCBI website (Note: Safari has problems downloading database files; use Firefox instead)

2. Click "NCBI FTP Site".  You can find the link at the bottom right of the page. (Note: select "Guest" in case you are asked for a user ID and password)

3. Go to "BLAST Basic Local Alignment Search Tool" and then click the link.  You will see a list of files and folders.

4. Open the folder "executables", go to "LATEST", and then double-click "ncbi-blast-2.2.26+-universal-macosx.tar.gz" or the latest version.  The download starts automatically.  You can find the blast folder in the "Downloads" folder.

5. Rename the downloaded folder to "blast", and then move it into your "Home Directory" (the one with the house icon, where the folders "Desktop", "Documents" and "Downloads" are stored).


6. Before setting up standalone BLAST, let's review some basic Unix commands.

Start "Terminal" and then try the following commands:

Note: You can find the program "Terminal" in the "Utilities" folder in "Applications".

Show your current directory (folder)
pwd

List all the files in a directory (folder), hidden files included
ls -a

Move to a different folder (change directory)
cd [folder name]
cd [folder]/[folder]

Go back to the Home Directory
cd

Create or edit a file
vi [file_name]
example) vi .bash_profile
type "a" to enter "edit mode" (vi calls this insert mode)
hit "esc" to exit edit mode
hit "ZZ" (shift+z twice) to save and return to the normal window

Delete an existing file (remove)
rm [file_name]
example) rm .bash_profile

Look at the content of a file
more [file_name]
example) more output.txt
(hit "q" to quit the command)

7.
Create .bash_profile (see Step 6) in your "Home Directory".  Here are the commands to create a new file that sets PATH.


vi .bash_profile [hit enter]  (this creates a file named ".bash_profile")

[hit "a"]  (this is to enter edit mode)
export PATH=/Users/[name of your home directory]/blast/bin:$PATH


[hit "esc"] (exit edit mode)

[hit "ZZ"]  (save the file and return to Terminal)

Restart Terminal (this is needed to activate the new PATH).
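
If you prefer not to use vi, the same line can be appended from the command line. This is only a sketch, assuming BLAST was unpacked to the "blast" folder in your Home Directory as in Step 5. The function name add_blast_to_path is made up for illustration, and $HOME stands in for /Users/[name of your home directory].

```shell
# Append the BLAST PATH line to a profile file, unless it is already there.
# Assumption: BLAST lives in $HOME/blast, as set up in Step 5.
add_blast_to_path() {
    profile_file=$1
    # Single quotes keep $HOME and $PATH unexpanded, so the profile file
    # stores the variables themselves and bash expands them at login.
    path_line='export PATH="$HOME/blast/bin:$PATH"'
    # grep -F: literal match, -x: whole line, -q: quiet (exit status only)
    if ! grep -Fxq "$path_line" "$profile_file" 2>/dev/null; then
        printf '%s\n' "$path_line" >> "$profile_file"
    fi
}

# Usage:
# add_blast_to_path "$HOME/.bash_profile"
```

Running it twice leaves a single copy of the line, so it should be safe to repeat. Terminal still has to be restarted afterwards.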

Note: Why do we need to set PATH?
Terminal looks for command programs in certain folders. You can list those folders by typing "echo $PATH". All the commands listed above (ls, more) are stored as program files in one of those folders. If you type "$PATH" alone, the shell prints the same list but also tries to run it as a command, so the list is followed by an error message:

/Users/tomokurobe/blast/bin:/usr/bin:/bin:/usr/sbin:/sbin:/usr/local/bin:/usr/X11/bin: No such file or directory

This means...
Terminal looks in the following folders, in this order, to find programs:
/Users/tomokurobe/blast/bin
/usr/bin
/bin
/usr/sbin
/sbin
/usr/local/bin
/usr/X11/bin
The message "No such file or directory" at the end appears only because the list itself is not a program.

By default, Terminal doesn't look in the newly added folder "blast", so we need to set PATH.  Otherwise Terminal cannot find the BLAST programs stored in the new folder.  As a last resort, we could move the BLAST programs into one of the folders above, but this is not recommended because you will be asked for a password every time you make a change.  Those folders contain important files for running OS X, so they are designed so that users cannot change them easily.
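
To make the lookup concrete, here is a small shell sketch of the same idea: split a PATH-style string at the colons and check each folder in order for a program. It only illustrates the principle and is not how bash itself is implemented. The function names list_path_dirs and find_in_path are made up.

```shell
# Print each folder of a PATH-style string on its own line, in search order.
list_path_dirs() {
    printf '%s\n' "$1" | tr ':' '\n'
}

# Print the first folder/program in a PATH-style string ($1) that holds an
# executable file named $2. Returns 1 if no folder contains it.
find_in_path() {
    path_string=$1
    program=$2
    old_ifs=$IFS
    IFS=:                       # split the unquoted string at colons
    for dir in $path_string; do
        if [ -f "$dir/$program" ] && [ -x "$dir/$program" ]; then
            IFS=$old_ifs
            printf '%s\n' "$dir/$program"
            return 0
        fi
    done
    IFS=$old_ifs
    return 1
}

# Usage:
# list_path_dirs "$PATH"
# find_in_path "$PATH" blastp
```

If find_in_path prints nothing for blastp, the "blast/bin" folder is most likely missing from PATH. The built-in command "which blastp" gives a similar answer.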

8.
8-1) Go to the NCBI FTP site and download a database file in FASTA format (double-click the FASTA file you want to use).  For testing, choose a small database such as "yeast.aa". The "nr" database is large; even the compressed file is bigger than 10 GB, and searches against it take a very long time.  For this tutorial, download the FASTA file for "yeast.aa".


8-2) The FASTA file is now in the "Downloads" folder.  Create a new folder "db" inside the "blast" folder and then move the FASTA file "yeast.aa" into it.


8-3) In Terminal, type "pwd" to see your current location.  You should be in your Home Directory.  Type "ls" and make sure that there is a folder "blast".  Type "cd blast/db" to reach the database folder.  Type "ls" to check that the FASTA file "yeast.aa" is there.


8-4) To format the database, type the following command:
makeblastdb -in yeast.aa -dbtype prot -parse_seqids
(-parse_seqids is not needed for the search itself)



This command creates 8 files (.phr .pin .pnd .pni .pog .psd .psi .psq).  Keep all of them; they are needed for running BLAST searches.


9. Run blast search

Note: we don't have to point to the individual database files created in the previous step.  Simply type the database name (yeast.aa) and don't add any file extension.  Here are the instructions.

9-1) Prepare test sequence(s) in FASTA format and name the file "test.fasta".

9-2) Stay in the "db" folder and then type the following command:

blastp -query test.fasta -db yeast.aa -out output.txt
(Note: this command creates the output file "output.txt", and you see the result in the regular output format, 50 hits)

more output.txt (to look at your result)


blastp -query test.fasta -db yeast.aa -out output.txt -max_target_seqs 1  -outfmt '7 std sseqid sgi'
(Note: tabular format with comments, 1 hit.  -outfmt '[number] [std (standard)] [other info you want to add]')

blastp -query test.fasta -db yeast.aa -out output.txt -max_target_seqs 1  -outfmt '6 sgi'
(Note: tabular format without comments, 1 hit)

Important Note: Comment from NCBI staff "Be careful using -max_target_seqs 1. You could miss the best hit. I would test with a higher number, like 50, to be sure setting to 1 gives the top hit with your query and db"
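
One way to follow this advice is to run the search with a larger -max_target_seqs (such as 50) and pick the best hit per query afterwards. The sketch below assumes the output was written with plain -outfmt 6 and its default columns, where column 1 is the query ID and column 12 is the bit score. With a custom column list such as '6 sgi' the column numbers would differ. The function name best_hit_per_query and the file name in the usage line are made up.

```shell
# Keep only the highest-bit-score hit for each query from BLAST tabular output.
# Assumption: the file was written with -outfmt 6 and the default columns,
# so column 1 is the query ID and column 12 is the bit score.
best_hit_per_query() {
    hits_file=$1
    tab=$(printf '\t')
    # Sort by query ID, then by bit score (numeric, descending);
    # awk then prints only the first line it sees for each query ID.
    sort -t "$tab" -k1,1 -k12,12nr "$hits_file" | awk -F '\t' '!seen[$1]++'
}

# Usage (hypothetical file name):
# blastp -query test.fasta -db yeast.aa -out hits.txt -max_target_seqs 50 -outfmt 6
# best_hit_per_query hits.txt
```

Comparing this result with the output of -max_target_seqs 1 is a quick way to run the check the NCBI comment suggests.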

10. Retrieve the definition of a gene (gene name) using the blastdbcmd program
Note: this can also be done just by adding stitle to the -outfmt option.
Without that, the BLAST search programs don't output the definition of the gene (gene name) in tabular formats (-outfmt 6, 7 or 10).  To retrieve the definition, we can use the blastdbcmd program.  I saw some people using Bio::Perl to do the task, but it's not necessary.

10-1) Prepare a file containing only "sgi" (subject GI).  The output format option -outfmt '6 sgi' creates what you need.

blastdbcmd -entry_batch output.txt -out result.txt -db yeast.aa -outfmt %t

Your result will be written to the file "result.txt"