Saturday, November 18, 2017

Some tips from my job hunt for a tenure-track assistant professor position

Note: As you read this, I assume that you are familiar with some of the common books and articles recommended to those applying for a tenure-track faculty position in the United States. I have listed some such references at the end of this article.

I started applying for tenure-track faculty positions in October 2016 and received my first onsite interview invitation in November, the very next month. That sounds very exciting. However, after I was turned down at that university, it took another five months to receive a second onsite invitation. For those five months, I was struggling like a butterfly trying to break free of its cocoon. I felt like all I needed in life was a job, any job. I was stressed and anxious most of the time. However, that was also when I learned the most. In the end, two things helped me find the job I dreamt of: my own persistence, and encouragement from my wife and friends. I had planned to write an article putting together everything I had learned about the job search, but because of time limitations, I am unable to do so. Instead, I decided to put together some bullet points from my notes rather than lose them eventually.

Preparing application package:
  • One advertisement was not quite clear, so I emailed the search committee chair with some questions. He replied, informed me that two more positions were open, and suggested that I consider those openings as well. I had not known about those openings. I happily applied to all three positions. It does not hurt to ask questions before applying.
  • A few positions I applied to were far from my area of research. I thought I would never hear back from these universities. At some of these places, that is exactly what happened: they never contacted me. But two of these universities did contact me. One came close to offering me the position. The other eventually offered me the position, and that offer was among my top choices. It does not hurt to apply more.
  • One university had advertised the position as 35% scholarly activity. During the onsite interview, I found out that it was a 100% teaching position; all faculty members in the department had a 4/4 teaching load. Clearly, on my end, I had not done sufficient research about this university, and whoever wrote the advertisement had not done a good job either. What I found by going to the campus could easily have been found during the phone interview.
  • In my application package and during interviews, I found that it is important to be generous in listing the courses I could teach. It is true that some courses might require a lot of preparation, but having too few courses in the 'list of courses I can teach' decreases the chances of getting selected in the initial rounds of interviews.
Phone/Skype interview related:
  • It is quite common to find errors in your emails only after you have sent them. To minimize such errors, I installed a Google Chrome extension called 'Select and Speak', which I used to read my emails aloud before sending them. Some important emails are worth printing and proofreading before sending out.
  • Questions that relate to the mission and vision of the university or the department are, I found, among the difficult ones. At some places, the interviewers directly asked me how I saw myself contributing to the mission and vision of the university. I rambled and did not provide a clear answer. Preparation is the only way to answer such questions.
  • During many phone interviews, I found that a lot of the committee members had not thoroughly read my CV. Importantly, during on-campus interviews, I found that a lot of the committee members did not remember many of the things we had talked about during the phone/Skype interview. Seeing this pattern, during phone interviews I practiced assuming that they did not have access to my application material and explicitly mentioned many of the things already in my application. Similarly, I treated each onsite interview as a fresh interview, i.e., without relying on anything from the phone/Skype interview or the application material that I had submitted. In other words, for every question they ask, the answer needs to be complete and 'full'. It is a bad idea to assume that they must have read 'it' in your CV.
  • Practicing mock telephone interviews with my friend Tuan, I found out that there were two things I needed to improve: I was speaking too fast and I was repeating myself a lot. Many times, I would make my point and then explain the same thing again. Practicing mock telephone interviews also helped me reduce the 'umm's and 'err's in my answers.
  • For clear communication during phone/Skype interviews, I found that using earphones/headphones is better than using laptop or desktop speakers.
  • Going to my university’s career center for a practice interview is helpful (although not a lot). It helped me groom my interview skills. These practice interviews are more effective when practiced with people in your discipline. The more I practiced the less I repeated myself.
  • After a phone/Skype interview, it is a good idea to send an email to the search committee mentioning that you remain interested in the position. I did not find this particularly important in our discipline, but I found that in some other disciplines it is critical.
  • In my initial Skype interviews, I was stressed about remembering everyone's name and knowing who was asking which question. After appearing in many Skype interviews, I realized that the names of the interviewers don't really matter, and neither does who is asking what question. It is definitely important to prepare ahead of time and know the background of the people asking you questions, but during the interview itself, I learned that I need not stress about their names.
  • During the initial months, as preparation for phone/Skype interviews, I had a lot of notes that I planned to refer to during the interviews. Unfortunately, they often distracted me. Many times, while trying to find the appropriate note, I did not fully understand the question being asked. Very soon, I realized how important it is to understand the question being asked, the context, and the tone of the interviewer. In later interviews, I slowly grew confident enough to ask interviewers to repeat their questions. Interestingly, when I did so, many interviewers apologized for having asked a question too long to follow and happily summarized it for me. Of course, it is a good idea to have ready-made 'punch line' answers for some common questions, but it is equally important to fully understand the questions and their contexts.
Onsite interview:
  • Some members of search committees directly asked me questions like, “What do you like about our department?”. On the other hand, some others 'studied' me not by asking me questions but by letting me ask them questions. It is a good idea to be prepared with neutral questions like, “How is the weather in this city?”, “What are some of the best things you like about this city?”, “What are some places to visit around here?”, etc. In my initial interviews, I asked some stupid questions. In one phone interview that lasted more than an hour, the search committee ran out of questions and I also ran out of mine. Then, they repeatedly asked me to ask more questions (they just wanted to keep talking). I ended up asking if their university was downsizing!
  • At some universities, I was directly invited to an on-campus interview without a phone/Skype interview. This was exciting. I later found out that this can sometimes be challenging, because the campus, the department, and the people can be totally different from your imagination/expectations.
  • Not receiving an offer after my first onsite interview helped me reflect on many of my preparations. In particular, I realized how narrow my slides were. I revised them with the help of feedback from many others. For later onsite interviews, as part of the preparation for the positions that I really wanted, I invited some of my graduate-student friends to attend my practice presentations, and they gave me a lot of useful feedback. I brought snacks and juice to such practice presentations.
  • My understanding of the dress code in computer science was that t-shirts and shorts are acceptable anywhere. I usually feel uneasy putting on a suit and a tie. For my first onsite interview, I did not wear a tie. Now I know that being too formal is always better than being less formal. For all the other interviews, I dressed formally the whole time, even for the dinners.
  • After an onsite interview, I often felt like a war was over. Before each interview, it felt like I had groomed myself, prepared my own tools (answers), and collected tools from others (tips and tricks) to prepare for the war.
  • After the first onsite interview, I realized how important it is to have the itinerary of the day with me all the time. With all the anxiety and stress, I usually forgot what was next. In all future on-site interviews, I took a screenshot of the itinerary and made it my cell phone’s wallpaper for the day.
  • In one onsite interview, someone asked me - “You have a great CV. I believe that with your profile, you must have received some offers already.” If the search committee members like you, they will be eager to find out if you already have another offer.
  • I often use my laptop while I am on a plane. During one of my travels for an onsite interview, a guy spilled wine on my pants, fortunately not on my laptop.
  • At smaller universities, I found that the search committee often feared how my meeting with the students would go, because they knew that that is when I could find out many truths about the campus.
  • Some useful things to carry with you – HDMI cable (to connect your laptop), lint roller, clicker (with laser pointer) and an umbrella.
  • To clear the 'background checks', it is important to have all the dates on your CV correct. I read somewhere that people who copy others' teaching philosophies or add fake work/teaching experiences can be caught during background checks. I also read that if any content in your application package is altered or fake, you might lose your job whenever they find out. Universities typically ask a third-party company to perform a background check on the candidate either before or after offering the position.
  • Learn the difference between confidence and arrogance. One can be confident, polite, and humble, all at the same time. A lot of my confidence grew from the analysis of my own mistakes, failures, and successes.
  • When I started applying for jobs, I was so excited that I often imagined getting a call from a search committee chair or a department chair offering me the job (directly, without any interviews). Now I can see how naïve and overexcited I was.
  • The process of applying for jobs and explaining my research to others helped me understand the bigger picture of my research. I also realized that having more 'attractive' terms/phrases in publication titles and thesis titles allows us to sell our work more easily. Now, when I write titles, I don't just draw a title out of the work; I also consider the future impact of the work. I revised my thesis title many times during the job search period.
  • Almost all other universities eventually sent me an email (sometimes even a snail mail) informing me that I was not selected. I received some emails 6 or 8 months later.
  • The most stressful task of all, during my job search, was sending out all the required recommendation letters. My own advisor was very prompt in sending out his letter of recommendation. However, some other recommenders had to be reminded many times. It always felt like I was begging for time with these extremely busy people. Most universities that offered me a position (or were close to making an offer) talked for about 30 minutes with my advisor. Some talked over the phone with all my references.
Here is a plot I created to graphically summarize my applications and interviews (all tenure-track positions).

Some widely read resources:

Thursday, April 20, 2017

Andrew Ng's recipe for Machine Learning development

The 'Dev-Train' dataset is like a validation dataset. Even though a trained model's performance on 'Dev-Train' may be good, it may still fall far behind on the 'Test' dataset. The reason is that the 'Train' dataset's distribution may be different from the distribution of the 'Test' dataset. For this reason, a small portion of the Test dataset (termed 'Dev-Test') can be used to calibrate the training process.

'Training bigger models' and 'Getting more data' are two things that someone can always try.

Data synthesis refers to changing the training dataset so that performance on the real problem (i.e., the Test dataset) improves. For example, oversampling the examples which are underrepresented can balance the dataset.
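As a toy illustration of the oversampling idea, here is a sketch in shell; the file names and the label layout (label in the second comma-separated field) are made up for the example:

```shell
# Hypothetical training file: 3 'pos' rows but only 1 'neg' row.
printf 'a,pos\nb,pos\nc,pos\nd,neg\n' > train.csv

# Print every row, and print each 'neg' row two extra times,
# so both classes end up with 3 rows each.
awk -F, '{ print } $2 == "neg" { print; print }' train.csv > train_balanced.csv
```

In a real pipeline the duplication factor would be derived from the class counts rather than hard-coded.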


Wednesday, May 6, 2015

Google Drive as my one-stop storage solution

Last week I implemented a one-stop solution for my storage management. I decided to use Google Drive to keep all my data and pictures instead of using both Google Drive and Dropbox. Just because Dropbox gives some space for free does not mean that we have to use it. Actually, I have realized that using multiple storage solutions demands a lot of management and worry time, so I have now decided to choose one state-of-the-art solution.

I already had 100 GB of space from the Chromebook I had purchased. To ensure that I don't have to worry about space again soon, I bought an extra 100 GB of space for 2 dollars per month. I like the idea of paying $2 per 100 GB.

Currently, I am setting up my devices so that I can use Google Drive universally. I do not have to delete others' pictures and videos; instead, I keep them and share them. Ideal for me!

Sadly, while managing my Google Drive, I emptied the trash to recover some space, not realizing that I had deleted one of my folders in Google Drive. I had a copy of the files elsewhere, so I could restore everything except two Google Docs files. When we delete Google Docs files, they are moved to the trash, and once we empty the trash, there is no way to recover them. Fortunately, losing the two files did not have a huge impact on my work, but the lesson I learnt is: do not empty the trash folder of online storage.

Observing how Google automatically organizes photos by creating month folders inside year folders, I am inspired to organize my other files and documents likewise. Organizing folders and files is often frustrating, not because of the actual organizing task but because of not knowing the best way to do it. In the future, this will not be a big issue because almost everything will be searched for instead of browsed. I find this technique of using years and months as folder names simple and easy. Also, it is wiser to rename files so that they can be searched for later instead of browsed.
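The same year/month layout can be scripted. A minimal sketch, where the file name is hypothetical and `date -r FILE` (the GNU coreutils form for printing a file's modification time) picks the destination folder:

```shell
f="vacation.jpg"
touch "$f"   # stand-in file for the example

# Build photos/YYYY/MM/ from the file's modification date and move it there.
y=$(date -r "$f" +%Y)
m=$(date -r "$f" +%m)
mkdir -p "photos/$y/$m"
mv "$f" "photos/$y/$m/"
```

Wrapped in a loop over a download folder, this sorts a whole backlog in one pass.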

[Update on 9-24-2016: After using Google Drive for a few months, I found the app buggy and slow compared to Dropbox. I still use Google Drive as storage, but for desktop (frequent) activities, I started to use Dropbox again. Currently, I use both Dropbox and Google Drive. However, I know exactly what goes where.]

[Update on 26-11-2017: Since last 7/8 months, I have been using Dropbox only - for storing everything. I use Google Drive only for 'gdocs', 'gslides', etc.]

Saturday, March 7, 2015

started to use only Perl for programming

In April 2014, I realized that I was using programming as a decorative item instead of a tool to get results, by focusing on programs more than on results. One of my biggest mistakes was using two programming languages: Perl and Bash scripting. Soon after this realization, I decided to use one language almost everywhere, and that would be Perl. Since then, I have been using only Perl wherever possible and have become a more effective programmer. Moreover, I was subconsciously following the 'one file, one purpose' philosophy of the Unix and open-source community, and that was another cause of my inefficiency. My programming experience led me to think that it is usually better to have two files: a main program that has everything (except for the most common subroutines), and a module with the common subroutines. This change has helped my efficiency and productivity, and has also helped me focus on the output of programs instead of the programs themselves.

Friday, January 30, 2015

Some lessons I learnt while scripting using Perl and Bash

1. Use '>' instead of '&>' by default for output redirection. While executing commands from Perl scripts, it is not wise to redirect errors along with output to a file with the '&>' operator. Errors must be thrown back to the calling program. '&>' should only be used for logging the output of the root/main command, because when we close the terminal, we still want the program to print its errors to the same file.
2. Avoid using 'rm -rf ' because it is a dangerous command; instead, use 'rm -f folder/*'. If 'rm -rf $abc' is needed, make sure that $abc contains some literal characters and is not entirely driven by user input. If the user input is null, the 'rm -rf' command can turn into 'rm -rf /'.
3. Export bash scripts (e.g., into the job folder) and execute them from there. This allows running only that job during debugging and testing.
4. Unlike Bash scripts, Perl scripts can be edited while the program is running. Sadly, without even knowing that a script was running, I have edited bash scripts many times, causing the running program to fail. For this reason, I stopped Bash scripting entirely. I use Perl for almost everything now.
5. Somewhere I read that, for readability, using underscores instead of capitalization or hyphens is better. This practice has been useful to me.
6. Use '2' instead of 'to' in identifiers, for example tbl2chimera, rr2tbl, etc.
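Lessons 1 and 2 can be sketched in a few lines of shell (file and directory names are illustrative): stdout alone goes to the result file so errors surface to the caller, and a `${VAR:?}` guard keeps `rm -rf` from expanding an empty variable:

```shell
# Lesson 1: inside a helper step, redirect stdout only ('>'), so that
# anything on stderr still reaches the calling program instead of being
# buried in the result file.
echo "result line" > results.txt

# At the top level, logging both streams (the effect of '&>') is fine:
# errors are still recorded even after the terminal is closed.
{ echo "run started"; ls /no/such/path; } > main.log 2>&1 || true

# Lesson 2: ${WORK_DIR:?} aborts if WORK_DIR is unset or empty, so this
# can never silently expand into 'rm -rf /*'.
WORK_DIR="scratch"
mkdir -p "$WORK_DIR"
touch "$WORK_DIR/old_file"
rm -rf "${WORK_DIR:?}"/*
```

The `:?` expansion is plain POSIX shell, so the guard costs nothing and works in any script.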

Wednesday, March 26, 2014

executed "rm -rf /*" today

Lucky me today. I opened one of my bash scripts, made a lot of changes, and ran it to test. Suddenly, my eyes went wide as I saw my script attempting to delete everything. I could see 'permission denied' messages everywhere on my screen. Without thinking, I pressed Ctrl+C and started to investigate what had happened. I had forgotten that bash does not allow spaces around the '=' sign in assignments. So, the second command effectively became "rm -rf /*", which is a command no one ever wants to execute in Linux. I am currently working on a server where my friend and I are preparing for the upcoming CASP. Life saved!
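The root cause can be reproduced safely. With spaces around '=', bash parses the line as a command named after the variable, so the variable is never set, and a later `rm -rf $VAR/*` expands to `rm -rf /*`. A sketch (the variable name is made up, and nothing destructive is run):

```shell
# Broken 'assignment': bash looks for a command called JOB_DIR, prints
# "JOB_DIR: command not found", and never sets the variable.
JOB_DIR = "/tmp/myjob" 2>/dev/null || true

# JOB_DIR is still empty here, so "$JOB_DIR/*" would expand to "/*".
echo "after broken assignment: JOB_DIR='${JOB_DIR:-}'"

# Correct assignment: no spaces around '='.
JOB_DIR="/tmp/myjob"
echo "after correct assignment: JOB_DIR='${JOB_DIR}'"
```

Combining this with the `${VAR:?}` guard from the earlier post makes the failure loud instead of catastrophic.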

Sunday, April 21, 2013

bash scripting - running jobs in parallel - an easy trick

The idea is to write a waiting script and place it right before the jobs we want to run in parallel, inside the loop (as shown in the second picture below). The script sleeps indefinitely while the maximum number of jobs is running. The algorithm is like this: (1) count the number of running jobs that match the provided string, and if the count is equal to or greater than the maximum number of jobs we want to run, wait for 60 seconds in a loop; (2) count and wait repeatedly until fewer jobs are running. Once the number of running jobs is less than the maximum, the script terminates. The script I wrote is accessible at
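The count-and-sleep loop described above can be sketched like this. The job name `my_worker`, the `jobs/` layout, and the 60-second poll are placeholders; `pgrep -c -f` counts processes whose command line matches the pattern:

```shell
#!/usr/bin/env bash
MAX_JOBS=4

# Block while at least MAX_JOBS matching processes are running;
# return as soon as a slot frees up.
wait_for_slot() {
    local pattern="$1"
    while [ "$(pgrep -c -f "$pattern" || true)" -ge "$MAX_JOBS" ]; do
        sleep 60
    done
}

# Place the wait right before launching each job inside the loop.
for input in jobs/*; do
    [ -e "$input" ] || continue   # skip if the glob matched nothing
    wait_for_slot "my_worker"
    ./my_worker "$input" &        # launch the job in the background
done
wait    # let the final batch finish
```

The original post counts jobs with ps and grep; `pgrep` is the same idea in one call.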

Here is a screenshot of the basic logic and also an implementation example:

Monday, March 25, 2013

Plotting data with R - when data has hyphens (-) along with numbers

While I was plotting data with R, I encountered a weird issue today. I had hyphens along with numbers in my data file, and I was not getting the expected results in the plot. This made me think and investigate the issue. I reduced my data to a small file and found that R actually behaves unexpectedly when there are hyphens along with numbers in a data column. For now, I am moving on by replacing all the hyphens with zeroes. Here are the details of my experiment. I do not know the cause yet, and it could be a bug in R.
Data file (with zeros and hyphens)

R script
Output of the two files
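The hyphens-to-zeroes workaround mentioned above can be applied once to the file itself before handing it to R. A sketch in shell (the file name is made up), replacing lines that contain only a hyphen:

```shell
# Hypothetical data column with hyphens standing in for missing values.
printf '1.2\n-\n3.4\n-\n5.6\n' > scores.txt

# Replace standalone hyphens with 0 before plotting.
sed 's/^-$/0/' scores.txt > scores_clean.txt
```

The anchored pattern `^-$` only touches lines that are exactly a hyphen, so negative numbers like -3.4 are left alone.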

Thursday, November 29, 2012

Reviewing friends’ writings

Last week we were asked by our advisor to write a book chapter. Coordination was difficult because six of us were working on it, and we were unclear about how the chapter should actually look. Even though we are all awesome friends, at times we were angry at each other because of differences in opinion and/or task-division issues. Many times, when we reviewed someone's writing as a group, we ended up in long debates. I think it is natural for any of us to start feeling defensive when we are thrown more than one suggestion in public, especially when we are not prepared to hear a lot of corrections. (It is my earlier strong experience that suggestions and comments are best made in person, not in public.) Recently, I experienced that we need to be careful during public review sessions. During these public sessions, minor mistakes and absolute corrections (when no other option is correct) are fine. However, general suggestions and opinions (usually beginning with “I think”, “My opinion is”, etc.) are better kept for later one-to-one talks or informal discussions. Also, when many people are making different comments, the person being commented on turns defensive, and we all end up trying to prove our own opinions. Instead, when we sit one-to-one, we are more prepared to hear about our mistakes and more open to new ideas. A public review should be performed only when all the people involved have already reviewed the writing individually before the session. Actually, it is always effective to send the writing to people, let them read it individually, and give them some time to think about it before sitting together for any review. The ideal case would be to get comments through email and then discuss anything that looks worth discussing after that.

Monday, November 26, 2012

I like Dropbox and the idea of online storage

I am a Dropbox fan. I use SkyDrive as well. With these tools, I feel secure and confident, as I can synchronize data across my computers without ever worrying about flash drives or other ad-hoc solutions. I don't need to think about synchronization issues at all. I work on a file at my lab, save it, go home, and continue working on it, without even thinking about it, because all my devices are connected to the Internet. I feel so happy when I think of this.
Today I ran into a potentially huge data-loss problem. I have 5 operating systems where my Dropbox account is synced. One of the virtual machines, in which I had Dropbox installed, ran out of space, and I deleted some files in the Dropbox folder to recover space, forgetting that Dropbox was ready to synchronize. In seconds, I lost those files from all of my devices, and they were not in any recycle bin. I was terrified. That is when I googled and found that Dropbox has a facility to restore deleted files. Nice, Dropbox! I easily recovered all of my files in no time.
There is a mistake we might make while restoring: we might restore an older version of the files/directories we want. We need to check the deletion date carefully before restoring. I encourage everyone to use online storage and synchronization tools like Dropbox and stay away from data-loss issues, unless, of course, the files are something that you never want to share.

My PC checklist (after installing new OS)

Every time I fix someone's PC, I have to recall the things that I need to check before I return it. I am keeping this list online so that I can access it, update it, and share it. :)
  1. Update time and timezone
  2. Install LAN Driver
  3. Install Wifi Driver
  4. Update Windows
  5. Install Chrome/Firefox
  6. Install VLC media player
  7. Install Adobe PDF reader
  8. Install MS Office
  9. Install and Update AntiVirus Software
  10. Check the Internet @
  11. Check webcam @
  12. Check sound playback @
  13. Check microphone @
  14. Install Audio and Video Drivers if needed
  15. Check CDs/DVDs in the DVD drive

Monday, June 25, 2012

Can we simulate protein folding? - "Levinthal's paradox"

Levinthal's paradox

Levinthal's paradox is a thought experiment, also constituting a self-reference in the theory of protein folding. In 1969, Cyrus Levinthal noted that, because of the very large number of degrees of freedom in an unfolded polypeptide chain, the molecule has an astronomical number of possible conformations. An estimate of 3^300 or 10^143 was made in one of his papers.[1] (Often incorrectly cited as a 1968 paper.[2]) For example, a polypeptide of 100 residues will have 99 peptide bonds, and therefore 198 different phi and psi bond angles. If each of these bond angles can be in one of three stable conformations, the protein may misfold into a maximum of 3^198 different conformations (including any possible folding redundancy). Therefore, if a protein were to attain its correctly folded configuration by sequentially sampling all the possible conformations, it would require a time longer than the age of the universe to arrive at its correct native conformation. This is true even if conformations are sampled at rapid (nanosecond or picosecond) rates. The "paradox" is that most small proteins fold spontaneously on a millisecond or even microsecond time scale. This paradox is central to computational approaches to protein structure prediction.[3]

Theoretically, a computer could calculate all the possible shapes for a sample protein and select the one with the lowest potential energy. In practice, however, this process could take longer than the age of the universe.
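The size of 3^198 is easy to sanity-check without big-number libraries, since the digit count of 3^198 is floor(198 · log10 3) + 1. A quick sketch using awk:

```shell
# Digit count of 3^198: 198 * log10(3) = 94.47, so 95 digits,
# i.e. roughly 10^94 conformations for the 100-residue example above.
digits=$(awk 'BEGIN { printf "%d\n", int(198 * log(3) / log(10)) + 1 }')
echo "3^198 has about 10^$((digits - 1)) conformations ($digits digits)"
```

Even at a picosecond per conformation, 10^94 samples dwarf the roughly 10^27 picoseconds in the age of the universe, which is the paradox in a nutshell.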

Perl and my Coding Practices

  • Underscores, underscores. No dots, hyphens, or capital letters. Use underscores to separate words in variable names.
  • Name variables starting from general words and move towards more specific ones. For example, dir_models, dir_models_refined, file_score_contacts, file_score_total, etc.
  • While calling another program or running another Perl script, always redirect STDOUT and STDERR as well. Examples: “perl >> log.txt 2>&1”, “perl > log.txt 2>&1”, etc.
  • Use a subroutine for logging and exiting instead of writing three lines of code each time (log the message, close the log, and exit or die). The subroutine can be something like this:

sub log_and_exit{
    my $message = $_[0];
    log_this("\nError! $message");  # log_this is defined elsewhere
    close LOG;
    exit(1);                        # exit with a non-zero status
}
  • The main program that calls all the other programs should maintain a log file, so that we can know where our program is at a given moment. Subprograms/scripts need not maintain a log file; printing to STDOUT is fine. That way, we can test each program individually as well.
  • For reading files from a directory, this could be one of the easiest ways:
my @file_list = <$dir_models_selected/*>;
foreach my $model (@file_list) {
    next if ($model =~ m/^\./);   # skip hidden entries
    system_cmd("cp $model $dir_models_for_3Drefine/") if ($flag_run_3Drefine);
}
  • Always give examples while accepting command line arguments. Here is an example.

my $dir_models            = $ARGV[0]; # jobs/Tc658/models
my $file_contact_info     = $ARGV[1]; # jobs/contact.txt
my $file_ranked_models    = $ARGV[2]; # /tmp/result.txt
my $dir_working           = $ARGV[3]; # /tmp

  • Instead of using the system($myprogram) command, use a subroutine to run commands. It can be something like this. It is cleaner and saves a lot of time.
sub system_cmd{
    my $command = $_[0];
    $command = $command." >> $log_detailed 2>&1";
    open DETAILEDLOG, ">>$log_detailed" or exit(11);
    print DETAILEDLOG "\n\nExecuting $0: [$command]";
    close DETAILEDLOG;
    my $status = system($command);
    if($status != 0){
        log_and_exit("Failed [$command] $status");
    }
}

Saturday, May 26, 2012

It took around 2 months to get a tool running

I had been trying to configure IMP (the Integrative Modeling Platform) for the last 5 or 6 weeks, along with doing many other things as well, of course. Actually, I needed to run an example from the web-site that generates PDB files (models) from an amino acid sequence. In around a week or so, I had it running. I could run a few examples that came along with the software, but not the one that I wanted to run. I thought that I had not installed the software properly. So, what I did was go back and try to FULLY install every piece of the dependent tools and software. The most difficult task was getting all of the Boost libraries running.

I spent many days trying to get it completely installed. I tried on two different servers. Thinking that it was a non-superuser permissions issue, I tried installing in a virtual machine with superuser access. That did not work either. I was frustrated and had started doubting my abilities. Finally, I wrote a long email to the mailing list with all my configuration and install logs. It turned out that I was mixing up the stable version 1.0 with the nightly builds: I had installed the stable version 1.0 and was trying to run the latest examples, which were meant for the latest version. Once I figured that out, I got it running in two days.

Learnt a lot!