Automatic PDF fetching of articles
Tired of clicking your way to the article PDFs you need? Check this out and find out how you can fully automate this process.
For this homepage we wanted to list all our publications as easily and automagically as possible. First, I simply ran a PubMed search, registered the results as an RSS feed, and used a WordPress RSS plugin to display our publications. This was nice and simple, but there is a catch: whenever we published something new, the RSS feed was updated to contain only the most recent publication, and all our previous publications disappeared. To list all our publications again, we had to re-register the RSS feed every time a new publication came out.
Of course, we could easily adapt the nice Ruby PubMed code that Anders just wrote about to list our publications. Still, I wanted a faster, integrated solution, so I decided to write a simple WordPress plugin in PHP to do the job. It is really quite simple: I use PHP's “file” function to read a URL that contains the search term we want to query PubMed for, and join the result into a string variable. Then I split the string with a regular expression, which gives me an array with one publication in each position (except the first element, which holds everything before the first paper, and the last, which still has the pagination markup attached). The WordPress plugin has some nice features for customizing the output with CSS, plus an admin interface, but if you do not use WordPress you could put the following function into your PHP file.
function PubmedList($searchterm) {
    // Add the (already URL-encoded) search term to the URL
    $url = 'http://www.ncbi.nlm.nih.gov/sites/entrez?Db=pubmed&Cmd=search&dispmax=500&Term=' . $searchterm;

    // Read the search result into the $html variable
    $html = implode('', file($url));

    // Add the NCBI server prefix to the local links
    $html = str_replace('href="/pubmed/', 'target="_blank" href="http://www.ncbi.nlm.nih.gov/pubmed/', $html);
    $html = str_replace('href="/sites/', 'target="_blank" href="http://www.ncbi.nlm.nih.gov/sites/', $html);

    // Split $html into the individual papers; the first and last elements need correcting
    $papers = preg_split('/<div class="rprtNum".*?<\/div>/', $html);

    // Cut the pagination markup off the last element so it only holds the last paper
    // (explode() replaces split(), which was removed in PHP 7)
    $tmp = explode('<div id="PaginationNode2"', array_pop($papers));
    array_push($papers, $tmp[0]);

    // $antal is the number of papers; the first element is not a paper, so it is not counted
    $antal = sizeof($papers) - 1;
    echo "<br /><h3>There are " . $antal . " published papers</h3><br /> <br />\n";

    // Add the opening divs that were lost by disregarding the first element
    echo '<div class="pubmed"><div class="rprt">' . "\n";

    $first = true;
    foreach ($papers as $p) {
        // Don't print the first element, since that is everything before the first paper
        if ($first) {
            $first = false;
        } else {
            echo $p . "\n";
        }
    }
}
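To see why the first array element is skipped and the last one is trimmed, the splitting step can be tried on its own. This is only a minimal sketch: the HTML fragment is made up to imitate the structure the function relies on (a "rprtNum" div in front of each paper and a "PaginationNode2" div at the end), and the real PubMed markup may well differ.

```php
<?php
// Made-up HTML fragment (an assumption, not real PubMed output).
$html = '<div id="header">Search results</div>'
      . '<div class="rprt"><div class="rprtNum">1.</div><p>First paper</p></div>'
      . '<div class="rprt"><div class="rprtNum">2.</div><p>Second paper</p></div>'
      . '<div id="PaginationNode2">Page 1 of 1</div>';

// Split on every numbering div; the non-greedy .*? stops at the first </div>.
$papers = preg_split('/<div class="rprtNum".*?<\/div>/', $html);

// The last element still carries the pagination markup, so cut it off.
$last = explode('<div id="PaginationNode2"', array_pop($papers));
$papers[] = $last[0];

// Element 0 is everything before the first paper, so take it out.
$preamble = array_shift($papers);

// Print the remaining elements as plain text, one paper per line.
foreach ($papers as $p) {
    echo trim(strip_tags($p)), "\n";
}
```

With this fragment, two elements remain after the preamble is removed, one per paper. Each of them ends with the opening tag of the next paper's "rprt" div, which is why the function has to echo one opening div itself before the loop.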
You could then call PubmedList using something like this:
PubmedList('(Torarinsson+E[Author])+AND+(2003%2F01%2F01[PDAT]+%3A+3000[PDAT])');
This would list all my papers since 01/01/2003.
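Writing the encoded term by hand is error-prone, so it may be easier to build it from a readable query. The helper below is hypothetical (buildPubmedTerm and its parameters are not part of the plugin) and only sketches the idea using PHP's built-in urlencode().

```php
<?php
// Hypothetical helper, not part of the plugin: builds a URL-encoded
// search term for one author from a given start date onwards.
function buildPubmedTerm($author, $fromDate)
{
    // Readable form of the query, same shape as the example above.
    $query = '(' . $author . '[Author]) AND (' . $fromDate . '[PDAT] : 3000[PDAT])';

    // urlencode() turns spaces into "+", "/" into "%2F" and ":" into "%3A";
    // brackets and parentheses are percent-encoded as well.
    return urlencode($query);
}

echo buildPubmedTerm('Torarinsson E', '2003/01/01'), "\n";
```

The result encodes more characters than the hand-written term (the brackets and parentheses too), but it decodes to the same query, so it can be passed to PubmedList in the same way.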
NB: To get a good search query, visit PubMed and search for what you want (for example using “Advanced search”). Once you have the results you want, click “Details” and then “URL”. This takes you back to the results page, but this time with the search term you need displayed in the address bar after Term=
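If you would rather not copy the term out of the address bar by hand, it can also be pulled out of the copied URL in PHP. This is a sketch only: extractPubmedTerm is a hypothetical helper, and the example URL is an assumption modelled on the one used in the function above.

```php
<?php
// Hypothetical helper: returns the decoded Term parameter of a URL,
// or null when the URL has no such parameter.
function extractPubmedTerm($url)
{
    // Take the part after "?" ...
    $query = parse_url($url, PHP_URL_QUERY);
    if (!is_string($query)) {
        return null;
    }

    // ... and split it into decoded name/value pairs.
    parse_str($query, $params);
    return isset($params['Term']) ? $params['Term'] : null;
}

// Example URL (an assumption), as it might be copied from the address bar.
$url = 'http://www.ncbi.nlm.nih.gov/sites/entrez?Db=pubmed&Cmd=search&Term=Torarinsson+E%5BAuthor%5D';

$term = extractPubmedTerm($url);
echo $term, "\n";

// The term comes back decoded, so encode it again before passing it on.
echo urlencode($term), "\n";
```

Note that parse_str() decodes the value, so it has to go through urlencode() again before it is appended to the search URL.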