It’s been a while indeed. I had to take my time to fix things on many sides of my scientific life, and to feel ready to get back to blogging. In a four-year PhD project, the second year always brings a little crisis along. Nothing special, and nothing really uncommon in the complex and diverse postgraduate world. I had to make up my mind on a couple of things, restore my initial motivation and organise my work to close the ongoing projects and carry on to the end of my thesis. Luckily, and thanks to some very important people who are close to me, I am now finding the courage to make some things clear and to try to overcome my limits.
If you’re up for some Sunday biological geekery, you might enjoy this video introducing the iDu Optics LabCam microscope adapter, which fits your iPhone onto a microscope ocular to show the image on the screen. On my old blog ATCGeek, I wrote a couple of comments on the possible role smartphones might have in biological research, and described some Android apps for genome browsing, as well as that geeky idea of building a microscope with a smartphone and a couple of pieces of plexiglass. Although many people comment that these devices won’t be of much help in wet and dry labs, I would say that they can still provide “a little help” in many situations, such as easing the visualization of a genome sequence when you cannot leave the bench, or just making the view of a microscope sample better and more comfortable, as in this case.
The iDu Optics LabCam Microscope Adapter is mostly designed for the iPhone, even if a Samsung S6 version is available. You can buy it on the website via PayPal, or on Amazon, but the price is still somewhat prohibitive: all the models cost around $250. OK, you might not want to go blind each time you have to view something on a microscope, but maybe that is not worth the price, which is the only real drawback I can see in this amazing product.
Just about one year ago, I was sharing a flat with some Spanish guys in the deep heart of Gràcia, a historic neighbourhood in Barcelona. To be honest, those guys fitted quite well into the definition of “friki” (the Spanish transliteration of the term “freaky”), which indicates the kind of people attracted by oriental spirituality, organic food, ecological behaviours, handmade flea-market clothes and hemp derivatives of all kinds. Boldly and briefly: hippies. Being an environmental activist with radical autonomist positions (I am a bit of a hippie too), I tend to get along well with this kind of people, at least until the moment they understand that I work in Science, in Biology, and most importantly in Plant Biology. The path from me explaining my work to them asking about GMOs is very short, and my efforts to explain that I just study the evolution of plants without modifying them are normally useless. And just about one year ago, I had to spend a whole afternoon defending my work and debunking a lot of their misconceptions.
Time for late confessions. This story dates back to the 2000s, and it is about the free diffusion of knowledge, the internet, evolution, an angry academic and a monster (I mean “another monster”, distinct from the angry academic). It was 2007, and I was trying hard to find a way out of my bachelor’s degree. At the time, I had to take the exam in Evolutionary Zoology, whose classes were held by Professor Raffaele Scopelliti at the Dept. of Zoology of the Sapienza University. I was never one for sitting in a classroom, and skipped as many classes as I could. It was permitted, and the schedule was so terribly organised that it was really hard to fit in all your commitments. This was life at the Italian university during the 2000s. Courses overlapped, the thesis lab work could take up your whole day, and the best you could do to survive was to choose a comfortable library to sit down in and study for extra-session exams. I was told that things have got slightly better in recent years, but back then this confusion matched my natural “too cool for school” attitude very well, keeping me away from lessons very often. During the spring of 2007 I had little time to study Evolutionary Zoology, I had not attended the course, and I needed a solution. Usually, the solution in these cases was to grab someone else’s notes, and one day my friend Amro showed up with a notebook full of notes from Prof. Scopelliti’s lessons. The notes belonged to a girl whose name I never knew. Amro had to return the notebook to her soon, and suggested that I photocopy all the pages.
I started to rewrite the notes on Google Docs, organising them as a real textbook, with chapters, sections, headings and all the rest. It was tough sometimes, since photocopies of a handwritten text are hard to read, and often it turned into a matter of free interpretation. Also, from time to time I found the job quite boring, and since I am a huge fucker, I started to slip jokes and foul language into my writing. As I said, it was 2007, and that story of the Flying Spaghetti Monster was just starting to spread in Europe. As I started to rewrite the lesson that swiftly (and of course critically) described the alternative evolutionary theories, from Lamarck to Creationism, I had the brilliant idea of inserting a description of the Flying Spaghetti Monster theory, taking care to mention that it was just a hilarious story.
The exam day came, and the result was strikingly good: 30 out of 30, the best mark you can get on the weird Italian grading scale. The real problems arose later. I was very active in promoting things such as open science and the free distribution of knowledge at the time, and the best I could do in my own small way was to publish my notes on an unofficial biology students’ forum we had (there was no Facebook yet, oldie me). The response was good. Students appreciated the initiative, the link spread very quickly, and people were quite happy to read notes where a joke could pop up from time to time and kill the boredom. Unfortunately, a couple of months later, I spotted a post on the message board of the official website of the Faculty of Biology. It was authored by Raffaele Scopelliti, and the title was “Warning on Evolutionary Zoology fake-notes”.
I opened the message and the body was imperative and threatening. I don’t remember the exact words, but it went something like this:
Dear Students, someone has published some very inaccurate and awkwardly incorrect Evolutionary Zoology notes that refer to my lessons. I gave no permission to publish them. I urge you to quit studying from them. I don’t know the author of this brilliant work, but I swear that I will find this guy.
I didn’t give it much importance. My exam was done and registered, and I was safely out of reach of the professor’s anger. But later on, it was explained to me how he came across my notes. First, some transcription errors spread, becoming very popular among the students, like Haeckel’s Biogenetic Law, which was written as “Haeckel’s Progenetic Law” because of a misreading of mine. Also, it seemed that everyone really liked the story of the Flying Spaghetti Monster, to the point that many people reported it during the exam. In Italy most exams are oral interviews, and those present told me that Professor Scopelliti, after having heard the story of the monster for the umpteenth time, literally started to yell “Who told you about this damned monster?”
The fact itself is funny, especially if you consider the very formal Italian academic environment. I admit that my story is of small interest, but I guess we could learn something from it. When I published my document online, I carefully and repeatedly warned people that they had to check everything in it, that those pages were just a raw product, and that it was full of inaccuracies to be corrected. Actually, this story taught me something about the way university students do their work. Most of the time people are so focused on learning as many notions as they can that they do not give due consideration to critical review. At the time, it made me think. I knew I wasn’t any better than most people, and the same lack of critical thinking that gets students to talk about flying monsters in an Evolution exam could have affected me as well. Also, it was the first time I experienced the danger of freely diffusing information on the internet, and some long reflections could be made on this point too.
But this is mostly a post for a late confession. Dear Prof. Scopelliti, I have no idea whether you will ever read these lines or not, but I guess that you might remember this story. I just want you to know that it was me, that I am still trying to make my way in Evolutionary Biology, and no, I don’t apologise for what I did.
My notes were still better than the nothing you shared as course materials.
Being in science is basically a matter of applying for a position. As you finish your master’s degree, you swiftly approach your first “application round” for a PhD, and after your PhD, the endless postdoc route will take you through many cover letters, interviews and candidate selections. As a senior scientist once commented to me, after his institute had rejected (quite harshly, I’d say) my application, it is fundamental to be able to handle an interview properly, because scientific institutions are importing their recruitment practices from companies. But how effective are these practices?
In this TED talk, Regina Hartley, a New York-born Senior Professional in Human Resources (SPHR) from the HRCI, explains that the right candidate may not be the perfect one. As you screen your candidates’ profiles, one of the things that should matter is how far the person you are assessing has demonstrated the ability to make their way out of hard conditions. Thus, an honours graduate from a prestigious university may not be as effective as someone who came out of a public university and had to face several difficulties in his/her life.
Here’s the link to the video in case the player above does not work (I am having problems with that, actually).
Sifting through my website stats, I realised that bioinformaticians are reading more of the posts discussing “how to do bioinformatics” than of the ones with strictly scientific content. Is this a feature of this blog, or does it reflect a common problem with working habits?
Drawing some conclusions after two years of atcgeek
After almost a couple of years of blogging on atcgeek, I dare say a thing or two about this experience. If I scroll through the statistics of this blog, I cannot really complain about the interest it has generated in its readers. Even though I won’t become famous by writing here, I can tell that 38k views since January 2014, with peaks around 1k views/day, are not a bad result, considering the long pause I had to take whilst moving to Barcelona and starting my PhD. Nothing really special, but not a disastrous failure either.
The three main topics at atcgeek
Although I usually divide my posts into thematic categories (bioinformatics, biochemistry, structural biology, etc.) and into types of article (news, insights, video, hacks and personal blog), I realised that I basically tend to write on three topics: education and work practices, methods, and reflections. The posts of the first kind are about “how to work in bioinformatics” or “where to learn the basics”. The second kind are the ones in which I report new methods that have recently been published, and the third category collects the posts that propose scientific insights on the role and the nature of computational and theoretical biology.
Most of the interest goes to posts on education and work habits.
The order in which I mentioned these three topics coincides with their ranking in terms of interest generated. Education and work practices come first, methods come second, and the bronze medal goes to the insights. Swiftly and boldly comparing my site statistics with the interest generated on social networks, I dare say that the people who read atcgeek are particularly interested in discussing how to improve their working habits and how to start working in bioinformatics, or in sharing a bit of self-irony with me as I talk about the shit I usually do when I work. Take it as an impression that is barely supported by statistics, but plausible enough to raise a question.
Based on what I see on atcgeek, people are more into discussing how to do bioinformatics or how to learn the basics than bioinformatics itself, and there could be some reasons behind this.
Of course, we should keep in mind that this blog is written by a PhD student who is sharing his experience while taking his first steps in computational biology. This matters, since anyone would be more interested in the opinions of someone more influential than me on anything about the “scientific part”. The main goal of this blog is to share my experience horizontally and to interact productively with my visitors, rather than to claim to be an expert in the field and aim at “coaching” the readers. On the other hand, if experience matters, it should matter for both topics, since the thoughts of an experienced scientist are more valuable than mine on both work habits and science.
Do we have a problem with how to do our work?
Although the shift I am seeing in the readers’ interest may be due to the characteristics of this blog, I still have the feeling that “how to work” is the major “hot topic” in the bioinformatics community, and we may strongly suspect that this reflects a problem. Bioinformatics is basically the domain of non-computer scientists working with computers, the merging of two super-rapidly changing sciences, and the development of proven, shared and consolidated work strategies is far from being a reality, especially if compared with experimental biology, where lab practices are widely discussed and protocols are consolidated.
There is one last thing to say. In the real ranking of the most visited posts, the most read one is not really about bioinformatics. Let’s say that this discussion should be focused on what bioinformaticians usually read online when they are keen to read about science. Including their other interests could be puzzling.
BTW, thank you for the interest in this stupid diary.
It was a cold November morning in 2011. Sapienza University has a huge campus next to the city centre of Rome, where the main faculties are housed in huge buildings in the rationalist style. Yet, the faculty of Biochemistry has a detached site in the neighbourhood flanking the campus, San Lorenzo. I was crossing the streets of this wonderful ex-industrial alternative hood to reach my new lab. The clock read 10:30 AM, and I was joining bioinformatics. Professor Stefano Pascarella had agreed to supervise my master’s thesis, and it was my very first day. Four years have passed, I have graduated and worked in five different labs, and even if my experience is not really long, I think I already have a couple of stories to tell.
Stupidity matters. Although most people tend to link science to intelligence and genius, seeing research as a matter for the “smart guys”, we must admit that the lab routine is often studded with the crap we make, and that researchers can become protagonists of actions of remarkable stupidity. And if we scan the first, faltering steps of a researcher’s career, we may find a couple of funny nerdish stories to tell colleagues in a bar. And since I’d be so sorry to know that any of you might run out of funny anecdotes about grad students’ stupidity, let me report the four most stupid things I have ever done in bioinformatics.
Trying to fetch information from UniProt on 1750 genes without any programming
The first task of my master’s thesis was simple. My advisor provided me with a list of 250 UniProt IDs of MocR proteins in several bacterial genomes. These are helix-turn-helix transcription factors, with an aminotransferase-like domain that regulates them allosterically by binding pyridoxal 5’-phosphate. The lab had identified these sequences with HMMER, and we wanted to know something more about the flanking regions. The professor told me to annotate 3 upstream and 3 downstream coding regions in order to see whether recurring genes could indicate a conserved multigene region; simple and straightforward.
The next day I was shattered, staring blankly at my screen at 8 pm, after ten hours of work. A hard lesson that I have learned over time is that if you did something wrong in designing your bioinformatics workflow, a spreadsheet will show up at a certain point. I was looking at an OpenOffice Calc window with about 40 rows, and had managed to find a way to scan the flanking region manually. I don’t remember my glorious strategy exactly, but it must have gone something like this:
- Copy and paste the ID into UniProt and search for it.
- Scroll all the way down to the cross-link pointing at a graphical genome browser and open it.
- Perfect, you are on the spot! Now move the browser forward and back, and you will find the flanking sequences.
- Select any flanking gene in the interval and make your way back to UniProt.
- Save the information you get (the UniProt ID, basically) on a spreadshit and go on.
I was then advised to stop doing this and to get on with studying Python. That was the day I learned that there is no bioinformatics without programming.
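For what it is worth, the manual routine above shrinks to a few lines once the gene coordinates are at hand. What follows is only a minimal sketch in Python, under loud assumptions: the genes of a genome are already available as (name, start) pairs, the function name flanking_genes and the toy gene names are made up for illustration, strand and circular chromosomes are ignored, and fetching the real annotation from UniProt or a genome browser is deliberately left out.

```python
def flanking_genes(genes, target, n=3):
    """Return the n genes before and the n genes after `target`.

    genes  -- list of (name, start) tuples for one genome (hypothetical input)
    target -- name of the gene of interest
    Strand and circular chromosomes are ignored in this sketch.
    """
    ordered = sorted(genes, key=lambda g: g[1])   # sort by start coordinate
    names = [name for name, _ in ordered]
    i = names.index(target)                       # raises ValueError if absent
    upstream = names[max(0, i - n):i]             # fewer than n near the edge
    downstream = names[i + 1:i + 1 + n]
    return upstream, downstream


# Toy genome: ten made-up genes, 1 kb apart
toy = [("gene%d" % k, k * 1000) for k in range(10)]
up, down = flanking_genes(toy, "gene5")
print(up, down)
```

With 250 starting IDs, each one plus its 3 + 3 neighbours gives 250 × 7 = 1750 genes to look up, which is where the number in the title comes from, and why a loop beats a spreadsheet here.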
The protein-DNA docking to fetch promoters.
After the first explorations, the final goal of my M.Sc. thesis work became the identification of a conserved promoter region upstream of the neighbouring genes pdxS and pdxT, coding for the two subunits of the pyridoxal 5’-phosphate synthase holoenzyme in bacteria. This memory tastes a bit sweet, as usual when you end up remembering how naive you were when you were just a newbie. It was early 2012, January or maybe February. During a lab meeting, I argued that a good option to find our promoters was to perform a docking analysis on a set of candidate promoter sequences, docked with the MocR transcription factor that had been found to activate their transcription. After having explained my point, I realised that everyone was just looking at me with dismay. Do you know that awful feeling of everyone in the room looking at you like you’re crazy? It was explained to me that the methods developed for protein-DNA docking were still too ineffective to give a reliable result. Protein-DNA docking to infer the binding region of an HTH? Pure science fiction. At least, that day I was introduced to one of my favourite topics in bioinformatics: the communication between DNA and proteins.
Declaring profanities as variables in your code.
Even if I am quite used to slipping jokes into my code, taking it as a “nerdish rebellion” against my even more nerdish work routine, what I am going to tell here didn’t actually happen to me. I include this story I have heard in my post because it’s really worth reading.
In teamwork, sharing code is fundamental, and the best habit you can adopt is to name variables in human language and to write proper comments, so that the people who read your code can understand it (to any possible extent). Anyway, the first thing you should care about before sharing your code is to make sure that it won’t worsen the opinion your colleagues have of you.
This story has all the ingredients that a good academic joke needs to succeed: a polite and old-fashioned thesis director, a graduate student with a sense of humor that his advisor won’t get, swear words, profanities, and a Perl script to show them off.
Stefano Pascarella is not old at all, but he is still the kind of super-mannered and polite Italian professor. I worked in his lab for two years, and never heard him yelling at anyone or even expressing disappointment harshly. Quite remarkable, since he was my thesis advisor. I never met the student who is the protagonist of this story, though, and I can only picture him as the typical 20-something master’s student. The only thing that I am pretty sure about is that one day he wasn’t at the lab, and his code was needed for some reason.
Professor Pascarella sat down in front of the terminal and rapidly found the file he needed. The people who told me this story just can’t forget the expression on the professor’s face. A calm and bored expression immediately turned serious, and then swiftly faded into dismay. Every single variable in the code he was reading was either a bad word or a profanity.
Later on that day, the student received a mail “kindly asking” him “to take his coding routine more seriously”.
Ignoring the find/replace function in a text editor.
OK, I can figure out what you are thinking. “This moron didn’t know that text editors had a find/replace function and corrected a whole script manually to change a single word”. Not so, I did something that is possibly worse. When I started to write code, I actually did not know much about the existence of this amazing function in my text editor, but I was still very sure that the process had to be automated. My ignorance of text editors mixed dramatically with my inclination for programming to give rise to one of the most stupid things I have ever done.
As I finished and tested the script named changeword.py, I was totally sure that it was one of the best things I could produce with my short programming experience. I don’t really remember the code, but it must have looked something like this:
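#!/usr/bin/env python
import sys
filein = sys.argv[1]          # file to read
word_to_change = sys.argv[2]  # word to look for
replacement = sys.argv[3]     # word to put in its place
a = open(filein, 'r')
b = a.read()
a.close()
# everything goes to the standard output
sys.stdout.write(b.replace(word_to_change, replacement))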
To run it, you just needed to give it the file, the word you wanted to change and its replacement, and everything went to the standard output:
$> ./changeword.py my_file.txt first_word second_word > my_corrected_file.txt
Et voilà, the text came out changed. Luckily, at a certain point I realised that my fantastic script didn’t work for every change I might need, and decided to discuss this problem with a postdoc in my lab. He is still laughing about this.
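One likely limitation of a script like this, assuming it was built around a plain substring replacement (which is a guess, not a memory), is that it also rewrites the word wherever it appears inside longer words. A minimal sketch with Python’s standard re module shows the difference; replace_word is a hypothetical helper, not part of the original script.

```python
import re


def replace_word(text, word, replacement):
    """Replace `word` only where it stands alone, not inside longer words."""
    pattern = r"\b" + re.escape(word) + r"\b"
    # Passing a function as the replacement keeps any backslashes
    # in `replacement` literal instead of treating them as escapes.
    return re.sub(pattern, lambda match: replacement, text)


text = "the cat sat in the category"
print(text.replace("cat", "dog"))        # substring match
print(replace_word(text, "cat", "dog"))  # whole-word match
```

The first call also mangles “category”, while the second one touches only the standalone word. Which, of course, is exactly what the find/replace dialog of any text editor offers as a checkbox.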
Write the MD5-checksum code on the same file from which I extracted it.
This happened a few months ago. Tracking your input, output and script files is very important, and even if we are not used to version control systems, annotating every file with its MD5 code may help, to some extent, in keeping better track of your work.
The MD5 algorithm computes a short fixed-length code (128 bits) from any input. If you feed a file to MD5, for everyday tracking purposes you can treat the output code as a fingerprint of that file, even though collisions exist and MD5 should not be relied upon for security. Of course, if you modify the file, the resulting MD5 code will change.
I was finishing a long scripting course and was adding information to my tab-separated output file in a header of lines starting with a hash. As I calculated my MD5 code, I had the brilliant idea of writing it into the same file from which I had computed it. Needless to say, after I had pasted the MD5 code into the file, the MD5 code of that new file inexorably changed.
It took me a good quarter of an hour to realise it. It was 9 PM, and I thought it was just my brain asking me to go home for some rest.
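The trap is easy to reproduce. Here is a minimal sketch using Python’s standard hashlib module; the file content, the header format and the helper name md5_of are made up for illustration and are not what I actually had on my screen.

```python
import hashlib


def md5_of(data):
    """Return the hexadecimal MD5 digest of a bytes object."""
    return hashlib.md5(data).hexdigest()


# Made-up content standing in for the tab-separated output file
content = b"gene\tscore\ngeneA\t0.9\n"
before = md5_of(content)

# Writing the checksum into a header line changes the content itself...
stamped = ("# md5: %s\n" % before).encode("ascii") + content
after = md5_of(stamped)

# ...so the checksum stored in the file no longer matches the file
print(before)
print(after)
print(before == after)
```

One way around it is to keep the checksum outside the file it describes, for instance in a separate file sitting next to the data, so that recording the code never touches the bytes being tracked.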
As I said at the beginning of this article, stupidity matters. And laughing at yourself matters even more. Cognitive work requires the application of all your rationality, and it is thus fundamental to understand its limits, that is, the borders of your intellectual skills that are shaped by stupidity. I think that there is no shame in recognising your own limits, and publicly admitting them is somehow therapeutic.
Quoting an Italian PhD student I met at my department who recently graduated, “there is no use for a PhD course except in the light of understanding how stupid you are”. I have recently registered for the second year of my PhD here at the CRAG, and still have a long way ahead to explore the deepest corners of my stupidity.
After all, the Diesel advertisement shown as the header image of this post may be right. You are stupid only if you try to explore your limits. And this is just what I am up to.