End hiatus.

Today we (Schlaufman and I) posted our latest paper on extremely metal-poor (EMP) stars to the arXiv.

Extremely metal-poor stars are interesting because they uniquely inform us about the early chemical state of the universe, amongst other things (metal-free stellar populations, supernovae, etc.). Unfortunately, EMP stars are extremely rare and usually intrinsically faint; indeed, progress in identifying and characterising EMP stars has been limited largely by how faint these stars typically are.

To address this, Schlaufman and I have developed a novel selection technique that identifies intrinsically luminous EMP stars using only infrared all-sky photometry. There is a good astrophysical basis for our selection, which we have iterated upon with a data-driven approach. Our selection is as efficient as existing techniques, but the candidates we identify are typically 3 magnitudes (~16 times) brighter than those found by other groups. That means it takes ~15 minutes to get good (high-resolution, high S/N) spectra for these stars, instead of the ~4 hours that would be required for targets identified by other methods.
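As a sanity check on those numbers (my own back-of-envelope arithmetic, not a calculation from the paper): a magnitude difference dm corresponds to a flux ratio of 10**(0.4 * dm), and at fixed S/N the photon-limited exposure time scales inversely with flux:

```python
# Back-of-envelope check: magnitude difference -> flux ratio -> exposure time.
dm = 3.0                          # candidates are ~3 mag brighter
flux_ratio = 10 ** (0.4 * dm)     # ~15.8x more flux
t_faint_minutes = 4.0 * 60        # ~4 hours for the fainter targets
t_bright_minutes = t_faint_minutes / flux_ratio

print(round(flux_ratio, 1))       # → 15.8
print(round(t_bright_minutes, 1)) # → 15.1
```

Which is consistent with the ~4 hours versus ~15 minutes quoted above.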

Using only infrared photometry has a number of advantages over existing selection techniques. Unlike objective prism surveys, our selection works well in crowded fields. Additionally, the effect of dust is ~50 times smaller in infrared photometry than in the optical. That means our approach is uniquely suited to places with high extinction (e.g., the bulge, where most Population III stars are expected to reside). And since our input photometry covers the entire sky, we can focus on the Northern hemisphere, where there has been relatively little work on searching for extremely metal-poor stars.

Now that we have proven our selection, we are increasing our rate of follow-up: next semester we are submitting proposals for time on five different telescopes (2.5 m to 8 m) to exploit our novel technique. Hopefully the telescope time allocation committees will take note of the quick turn-around in this paper: most of our 506 stars were observed only 11 weeks ago! And a lot of that time was spent with Schlaufman and me debating who would lead the first paper; we were each arguing for the other to lead.

In addition to calculating ensemble (homogenised) parameters for the sample of CoRoT stars in the Gaia-ESO Survey this week (blog post to appear later), I’ve been working with a student of Thomas Masseron’s. Masseron wanted to know whether we could identify spectroscopic binary systems from limited, noisy photometry alone, and infer the system properties (e.g., stellar parameters of both components, mass ratios). It’s a cool problem for a lot of reasons.

Spectroscopists often just throw away binary systems because they aren’t worth the effort to analyse. The fraction of data thrown away for this reason is of order a few percent. That’s a lot of stars for big surveys, which means being able to identify these objects from photometry is a big win. There are obvious scientific extensions too: the binary fraction itself, binary fraction distributions for multiple populations within globular clusters, mass ratio distributions, etc. Without any astrophysical priors on mass/radius/luminosity ratios, it turns out you can identify these systems very easily with modest photometric data. However, as one might expect, the quality of inference depends on the properties of the individual system: stars of similar mass and evolutionary state are much harder to distinguish, because you’re essentially just seeing a not-quite-right blackbody curve. The student (L. Orfali) will investigate the inference quality for different binary system properties, and determine the minimum photometric quality (and which bands) required to constrain these systems. Spectroscopic modelling will begin next week too, but that part is trivial and easier to intuit.
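To illustrate why similar components are hard to distinguish, here is a toy sketch of my own (not the student's actual analysis): the composite SED of two blackbodies is close to, but never exactly equal to, a single scaled blackbody.

```python
import numpy as np

def planck(wavelength_m, teff):
    """Blackbody spectral radiance B_lambda(T) in SI units."""
    h, c, k = 6.626e-34, 2.998e8, 1.381e-23
    return (2 * h * c**2 / wavelength_m**5
            / np.expm1(h * c / (wavelength_m * k * teff)))

# A composite of a 6000 K primary and a fainter 4500 K secondary,
# sampled from the optical to the near-infrared.
wl = np.linspace(0.4e-6, 2.2e-6, 50)
composite = planck(wl, 6000.0) + 0.2 * planck(wl, 4500.0)

# Brute-force the single-temperature blackbody (with a free scale
# factor) that best matches the composite's shape.
best_resid = np.inf
for teff in np.arange(4000.0, 8000.0, 50.0):
    model = planck(wl, teff)
    scale = composite.mean() / model.mean()  # crude flux normalisation
    resid = np.sum((composite - scale * model)**2) / composite.max()**2
    if resid < best_resid:
        best_resid, best_teff = resid, teff

# No single temperature reproduces the composite exactly: the residual
# is always non-zero, which is the "not-quite-right blackbody" signal.
print(best_teff, best_resid)
```

The residual shrinks as the two components approach the same temperature, which is exactly why equal-mass, equal-evolutionary-state pairs are the hardest to flag from photometry alone.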

Laura Watkins (STScI) gave an excellent talk this week on the possible existence of an intermediate-mass black hole at the centre of omega Cen. Lots of exquisite data (HST and ground-based spectra), with very detailed modelling. It’s an awesome project!

Rule of Observing

The first rule of observing is: you don’t leave the telescope until your data are reduced and analysed. If that seems like too much to ask then you’re using old analysis approaches and your competition isn’t.

When I was learning how to analyse high-resolution stellar spectra, I wrote an intuitive, graphical software package for analysing spectra quickly and precisely. What used to take ~1 day per star now takes a couple of minutes, and it means I (and now all my collaborators) follow the rule! We can vet candidates quickly, find the most interesting objects, and return to them in the same night. Now the reduction takes more than an order of magnitude longer than the analysis! The code is described in Chapter 3 of my thesis, and a screenshot is below. There are more objective (read: better) ways to do stellar spectroscopy – and I will post about this in the future – but the code gives us a very good idea of what we’re looking at, very quickly. That’s important.


The last three nights I’ve been observing on Magellan (with Schlaufman) using the MIKE spectrograph, looking for extremely metal-poor stars using a novel technique devised by Schlaufman and me. The selection approach is as efficient as (or more efficient than) existing techniques, but the candidates are ~3 magnitudes brighter. That makes the requisite follow-up spectroscopy achievable for a large sample of stars. And our approach only uses existing all-sky surveys, so targets are available throughout the year no matter where you’re observing from. The approach will appear in print later this year.

You should always reduce your data carefully by hand. Unless you’re lazy or time-poor. If that’s the case and you’re using MIKE on Magellan (where this post is written from) then the CarPy pipeline will do a pretty good reduction for you in most cases.

However it turns out it’s broken on the Las Campanas Observatory computers. Here’s how to fix it:

setenv PYPREFIX /usr/local/CarPy
setenv PYTHONBASE /usr/local/CarPy/builds/Darwin.10.6.x86_64/
source /usr/local/CarPy/Setup.csh 
setenv PATH /usr/local/CarPy/builds/Darwin.10.6.x86_64/bin:/usr/local/CarPy/builds/Darwin.10.6.x86_64/Python.framework/Versions/2.5/bin:/usr/local/CarPy/dist/bin_local:/usr/local/CarPy/dist/bin:/usr/local/CarPy/dist/bin_oldnumeric:/usr/local/CarPy/builds/Darwin.10.6.x86_64/bin:/usr/local/CarPy/builds/Darwin.10.6.x86_64/Python.framework/Versions/Current/bin:/usr/local/CarPy/dist/bin_local:/usr/local/CarPy/dist/bin:/usr/local/CarPy/dist/bin_oldnumeric:/Library/Frameworks/EPD64.framework/Versions/Current/bin:/usr/local/bin:/usr/bin:/bin:/usr/sbin:/sbin:/usr/texbin:/usr/X11/bin:/usr/local/wcstools/bin:/usr/local/isis/bin:/usr/local/cdsclient/bin:/Applications/itt/idl/bin:/usr/local/magellan/bin:/usr/local/lco/bin
setenv PYTHONPATH /usr/local/CarPy/dist/lib_local:/usr/local/CarPy/dist/lib:/usr/local/CarPy/dist/lib_oldnumeric
setenv PYTHONDATA /usr/local/CarPy/datafiles

Now you can follow the instructions properly. But there is one additional step for the blue arm. After you’ve run this step:

mikesetup -db DATABASE_FILE -blue -all -mk Makefile

You will need to add a flag in lampblue/Makefile before running make. To find the right line (it’s usually line 59):

cd lampblue
grep -n mikeMatchLamps Makefile
59:	mikeMatchLamps lampblue_lamp1136fbspecs.fits -x 5 -o 4

Then just add -maxsh 300 so it looks like:

grep -n mikeMatchLamps Makefile
59:	mikeMatchLamps lampblue_lamp1136fbspecs.fits -x 5 -o 4 -maxsh 300

And now you should be good to make.
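If you'd rather not edit the Makefile by hand, the same change can be scripted. A sketch (the Makefile contents here are mocked up, and this uses GNU sed syntax; on the observatory Macs you'd need `sed -i ''`):

```shell
# Mock up the relevant rule (hypothetical contents) in lampblue/Makefile:
mkdir -p lampblue
printf 'all:\n\tmikeMatchLamps lampblue_lamp1136fbspecs.fits -x 5 -o 4\n' > lampblue/Makefile

# Append -maxsh 300 to the mikeMatchLamps rule:
sed -i 's/\(mikeMatchLamps .*\)/\1 -maxsh 300/' lampblue/Makefile

# Confirm the flag is there before running make:
grep mikeMatchLamps lampblue/Makefile
```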

Triple J is an Australian radio station, and every year they run Triple J’s Hottest 100, a democratically elected pick of the top 100 songs produced in the previous year. It is the largest democratic music election in the world, and it becomes more popular each year. Each person can vote for 10 songs, and on Australia Day the station counts down to #1. Any song is eligible for a vote, but Triple J usually only lists the top ~1,000 songs on their website.

Last year I wanted to make “The most informed decision I ever made”: I would listen to every song on the Triple J website, give each a score, and then choose my top 10 from my highest-rated songs. It took around two weeks to listen to every song, and there were certainly some crappy ones. But after all of it, I had a great playlist of songs with “4 or more stars”. Last year I had to write some Python code to scrape all the songs from the Triple J website, search YouTube, download each video, extract the audio to MP3, and put it in an iTunes playlist.

This year it’s even easier because they have put all 1,008 songs in a Spotify playlist. Here’s what I didn’t do, but what I would do if I wanted to grab all these songs:

Steps to making the most informed decision you’ll ever make

** Note: Read all the steps first, you might find you can skip Step #1 :-) **

  1. Open Spotify and find the Triple J Hottest 100 Candidates playlist

  2. Select all songs, copy, and paste to notepad. This is what it should look like. Save this file as hottest-100-candidates.txt in a new folder.

  3. I’m assuming you have Ruby installed here. If so, from a terminal run gem install spotify-to-mp3

  4. spotify-to-mp3 hottest-100-candidates.txt will find the artist and name for each song, search for it on Grooveshark, and download it to the current directory.

  5. Add all your songs to an iTunes playlist. Listen to it. Rate each song out of five stars as it ends.

  6. Vote!
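If you end up scripting Step 2 yourself, parsing the pasted text is straightforward. A sketch, assuming each pasted line looks like "Artist - Track Name" (the actual Spotify paste format may differ):

```python
def parse_playlist(text):
    """Split pasted 'Artist - Track' lines into (artist, track) pairs."""
    songs = []
    for line in text.splitlines():
        line = line.strip()
        if " - " not in line:
            continue  # skip blank or malformed lines
        artist, _, track = line.partition(" - ")
        songs.append((artist.strip(), track.strip()))
    return songs

sample = "Lorde - Royals\nThe Preatures - Is This How You Feel?\n\n"
print(parse_playlist(sample))
# → [('Lorde', 'Royals'), ('The Preatures', 'Is This How You Feel?')]
```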

After I had listened to ~2 weeks of music last year, carefully rating each song, I forgot to do the last step. So for me, “The most informed decision I ever made” became “The most informed decision I never made”.

In this post I’m going to give some very basic examples of how to get Python and TOPCAT (or other VO/SAMP applications) to talk to each other. The Python module you’ll need is called SAMPy. This module will eventually be incorporated into the AstroPy package. To install SAMPy:

pip install sampy

(or if you must, use easy_install sampy)

For our first example we’ll get TOPCAT to notify Python when we highlight a point or row in TOPCAT:

""" Interact with TOPCAT via SAMPy at the most basic level """

import sampy

if __name__ == "__main__":

    # The 'Hub' is for multiple applications to talk to each other.
    hub = sampy.SAMPHubServer()

    # We need a client that will connect to the Hub. TOPCAT will also
    # connect to our Hub.
    client = sampy.SAMPIntegratedClient(metadata={
        "samp.name": "topdog",
        "samp.description.text": "Live demos are destined for disaster."

    # Create a 'callback' - something to do when a point or row is highlighted in TOPCAT
    def receive_samp_notification(private_key, sender_id, mtype, params, extra):
        print("Notification of {0} from {0} ({1}): {2}, {3}".format(mtype, sender_id, private_key, params, extra))

    # Register the callback
    client.bindReceiveNotification("table.highlight.row", receive_samp_notification)


  1. Put the above code in a file named basic_example.py, then run it from an interactive session (e.g., in IPython: run -i basic_example.py) so the hub stays alive.

  2. Open TOPCAT and load a file. Ensure there are 3 icons in the SAMP Clients tab at the bottom of the TOPCAT GUI.

  3. In the “Current Table Properties”, make sure the “Broadcast Row” icon is ticked.

  4. Highlight a row and look at the Python output:

In [1]: run -i basic_example.py
[SAMP] Info    (2013-12-10T16:01:45.882344): Hub started

In [2]: 
Notification of table.highlight.row from cli#3 (5338b24be010f6ca598c744f3eea3afc): {'url': 'file:/Users/andycasey/thesis/presentations/2013/2013-csiro-astro/data/fld_list_230611', 'row': '4'}, {}

I noticed something weird today. The exact same inputs and code were exhibiting completely different behaviour on two different clusters. The only difference between them was the SciPy version: 0.10.1 (correct behaviour) versus 0.12.0 (incorrect behaviour). Here’s the line in question:

p1, cov_p, infodict, mesg, ier = scipy.optimize.leastsq(errfunc, p0.copy()[0], args=args, full_output=True)

The correct behaviour on 0.10.1:

ipdb> scipy.__version__
'0.10.1'
ipdb> errfunc(p0.copy()[0], *args)
array([ 0.06799529,  0.07318012,  0.06680378,  0.05200964,  0.05814424,
        0.09025226,  0.09680308,  0.05702837, -0.14674592, -0.22665459,
       -0.15485406, -0.01311882,  0.08502507,  0.10292671,  0.08557168,
        0.05098229,  0.04956718,  0.06520266, -0.05950772, -0.29728424])
ipdb> scipy.optimize.leastsq(errfunc, p0.copy()[0], args=args)
(array([  4.78875656e+03,   6.67606610e-02,   6.42906789e-01]), 2)

The incorrect behaviour on 0.12.0 (after excluding all other differences and possibilities):

ipdb> scipy.__version__
'0.12.0'
ipdb> errfunc(p0.copy()[0], *args)
array([ 0.06799529,  0.07318012,  0.06680378,  0.05200964,  0.05814424,
        0.09025226,  0.09680308,  0.05702837, -0.14674592, -0.22665459,
       -0.15485406, -0.01311882,  0.08502507,  0.10292671,  0.08557168,
        0.05098229,  0.04956718,  0.06520266, -0.05950772, -0.29728424])
ipdb> scipy.optimize.leastsq(errfunc, p0.copy()[0], args=args)
(array([  4.78874773e+03,   7.96486918e-02,   4.68803543e-01]), 2)

You can see that errfunc behaves identically, but scipy.optimize.leastsq does not. If you ever have this problem, all you need to do is set the epsfcn keyword. The epsfcn keyword is described as:

A suitable step length for the forward-difference approximation of the Jacobian (for Dfun=None). If epsfcn is less than the machine precision, it is assumed that the relative errors in the functions are of the order of the machine precision.

In scipy 0.10.1 the default value is 0.0, but in 0.12.0 the default value is None. In this example, 0.0 and None are very different beasts, which makes the default behaviour for scipy.optimize.leastsq unintuitively different between versions.

On 0.12.0, you can recover either behaviour by setting epsfcn explicitly:

ipdb> scipy.__version__
'0.12.0'
ipdb> optimize.leastsq(errfunc, p0.copy()[0], args=args, epsfcn=0.0)
(array([  4.78875656e+03,   6.67606608e-02,   6.42906789e-01]), 2)
ipdb> optimize.leastsq(errfunc, p0.copy()[0], args=args, epsfcn=None)
(array([  4.78874773e+03,   7.96486918e-02,   4.68803543e-01]), 2)

So there you go. If you’re using scipy.optimize.leastsq, make sure you specify epsfcn explicitly (e.g., epsfcn=0.0) so your code behaves consistently across versions.
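To make that concrete, here is a minimal sketch (my own toy problem, not the code from the clusters) of pinning epsfcn in a leastsq call:

```python
import numpy as np
from scipy import optimize

# Toy linear model: y = m*x + b, fit by least squares.
x = np.linspace(0, 10, 20)
y = 2.0 * x + 1.0

errfunc = lambda p, x, y: y - (p[0] * x + p[1])
p0 = [1.0, 0.0]

# Specifying epsfcn explicitly pins the forward-difference step used
# for the Jacobian, so the behaviour can't silently change with the
# installed scipy version.
p1, ier = optimize.leastsq(errfunc, p0, args=(x, y), epsfcn=0.0)
print(p1)  # → approximately [2. 1.]
```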

I use git every day. You should use git (or some other git-esque system) when writing research papers, because it’s a great way to track all of your changes. The real-world problem is that my co-authors don’t use git.

Typically I’ll draft a manuscript, distribute the document (PDF and/or LaTeX) by email, and wait for feedback. Some will provide changes to the LaTeX, others will annotate the PDF, some will provide itemised text responses, and some will print it, scribble on it, and hand me a butchered manuscript.

After sending the manuscript around once to everyone, I don’t want them to have to read everything through again: they should just notice the changes. It’s easier, and faster for everyone. To accomplish this I’ve installed latexdiff. It’s a Perl script that highlights the differences between two TeX files. You can download it here, or just read about it.

Once latexdiff is installed, let’s initiate a git repository and start writing a paper.

mrmagoo:research andycasey$ mkdir my-paper
mrmagoo:research andycasey$ cd my-paper/
mrmagoo:my-paper andycasey$ git init
Initialized empty Git repository in /Users/andycasey/research/my-paper/.git/
mrmagoo:my-paper andycasey$ echo "This is a fake paper to test latexdiff" > README
mrmagoo:my-paper andycasey$ git add README 
mrmagoo:my-paper andycasey$ git commit -m "Initial commit"
[master da886dd] Initial commit
 1 file changed, 1 insertion(+)
  create mode 100644 README

When I make major revisions to a paper (e.g., when I send out copies to co-authors), I want to use latexdiff to automatically create a file that highlights the changes from the previous version. Let’s set up a post-commit hook by putting the following code into a new file in your folder called .git/hooks/post-commit. Make sure this is executable by using chmod +x .git/hooks/post-commit. Now any time we commit to the repository, this script will run.


#!/bin/bash
# Post-commit hook for revision-awsm-ness

function gettempfilename() {
    tempfilename="$1-$RANDOM.tex"
    while [ -e "$tempfilename" ]; do
        tempfilename="$1-$RANDOM.tex"
    done
    echo "$tempfilename"
}

num_revisions=$(git log --pretty=oneline | grep -ic "revision v[0-9]")

# See if there are at least two revisions so that we can do a comparison
if [ "$num_revisions" -lt 2 ]; then
    exit 0
fi

# Check to see if the last named revision is actually the commit hash that just happened
current_hash=$(git rev-parse HEAD)
current_revision=$(git log --pretty=oneline | grep -i "revision v[0-9]" | grep -oPi "v\d+(?:\.\d+)*" | head -n 1)
most_recent_revision_hash=$(git log --pretty=oneline | grep -i "revision v[0-9]" | head -n 1 | awk '{ print $1 }')

# If the last commit wasn't the one that contained the most recent revision number,
# then there's nothing to do.
if [[ "$current_hash" != "$most_recent_revision_hash" ]]; then
    exit 0
fi

previous_revision=$(git log --pretty=oneline | grep -i "revision v[0-9]" | grep -oPi "v\d+(?:\.\d+)*" | sed -n 2p)
previous_revision_hash=$(git log --pretty=oneline | grep -i "revision v[0-9]" | sed -n 2p | awk '{ print $1 }')

# Use the most edited tex file in this repository as the manuscript,
# unless the manuscript filename was specified as an argument
most_edited_tex_file=$(git log --pretty=format: --name-only | sort | uniq -c | sort -rg | grep ".tex$" | head -n 1 | awk '{ print $2 }')
manuscript_filename=${1:-$most_edited_tex_file}

# If we can't find the manuscript filename, then exit.
if [ ! -f "$manuscript_filename" ]; then
    echo "Manuscript file $manuscript_filename does not exist."
    exit 1
fi

diff_ms_no_file_ext="${manuscript_filename%.tex}-revisions-$current_revision"

# Get the manuscript file associated with the previous revision hash
previous_manuscript_filename=$(gettempfilename previous)
git show $previous_revision_hash:$manuscript_filename > $previous_manuscript_filename

# Use latexdiff to create a difference version
latexdiff $previous_manuscript_filename $manuscript_filename > $diff_ms_no_file_ext.tex
rm -f $previous_manuscript_filename

# Compile the difference file
pdflatex $diff_ms_no_file_ext.tex > /dev/null 2>&1
bibtex $diff_ms_no_file_ext > /dev/null 2>&1
pdflatex $diff_ms_no_file_ext.tex > /dev/null 2>&1
pdflatex $diff_ms_no_file_ext.tex > /dev/null 2>&1

# Remove the intermediate files
ls $diff_ms_no_file_ext.* | grep -v pdf | xargs rm -f
echo "Revisions to $manuscript_filename made between $previous_revision"\
     "and $current_revision are highlighted in $diff_ms_no_file_ext.pdf"

Okay, now let’s work with an example of a “real” paper. Here’s the LaTeX for a manuscript:



\documentclass{article}

\title{Fun with git and \LaTeX{}}
\author{Andrew R. Casey, Alice, Bob}

\begin{document}
\maketitle

\begin{abstract}
One does not simply write an abstract.
\end{abstract}

\section{Introduction}
Here is the text of our introduction.

\begin{equation}
    \alpha = \sqrt{ \beta }
\end{equation}

\section{Conclusions}
There are many loose seals in the ocean.

\end{document}


Let’s commit this to the repository, and make a note in the commit message that this is version v0.1 of the paper.

mrmagoo:my-paper andycasey$ git add manuscript.tex
mrmagoo:my-paper andycasey$ git commit -m "First draft of paper, so revision v0.1"
[master 8857fdb] First draft of paper, so revision v0.1
 1 file changed, 26 insertions(+)
  create mode 100644 manuscript.tex

I send this version around to my co-authors Alice and Bob, and wait for their responses. Each time someone responds, I implement their suggestions and commit the changes to the repository.

Alice says:

> You forgot my last name! I think you’re missing a constant from Equation 1. Also, can we use “simply not” instead of “not simply”?

We make the changes, then commit to the repository.

mrmagoo:my-paper andycasey$ git add manuscript.tex 
mrmagoo:my-paper andycasey$ git commit -m "Implemented changes suggested by Alice"
[master 6df5f6f] Implemented changes suggested by Alice
 1 file changed, 2 insertions(+), 2 deletions(-)

Bob says:

> You should be more explicit in the conclusions. Perhaps mention how Buster should act accordingly? Also, you forgot my last name!

Bob’s suggestions are good, so we implement them. Here’s what the final LaTeX looks like:



\documentclass{article}

\title{Fun with git and \LaTeX{}}
\author{Andrew R. Casey, Alice A. Aaronson, Bob B. Baaronson}

\begin{document}
\maketitle

\begin{abstract}
One does simply not write an abstract.
\end{abstract}

\section{Introduction}
Here is the introduction.

\begin{equation}
    \alpha = \sqrt{ \beta } + C
\end{equation}

\section{Conclusions}
There are many loose seals in the ocean. Buster is not allowed to swim in the ocean.

\end{document}


Since Bob is the last of our co-authors, once we’ve implemented his changes we can call this v0.2. Notice that “Revision v0.2” can appear anywhere in the commit message, and is not case-sensitive.

mrmagoo:my-paper andycasey$ git add manuscript.tex
mrmagoo:my-paper andycasey$ git commit -m "Put in changes suggested by Bob. Revision v0.2 ready to be sent out"
Revisions to manuscript.tex made between v0.1 and v0.2 are highlighted in manuscript-revisions-v0.2.pdf
[master de29aa6] Put in changes suggested by Bob. Revision v0.2 ready to be sent out
 1 file changed, 2 insertions(+), 2 deletions(-)

Notice the extra message at the start? Our post-commit hook has run and seen that we have more than one revision in our commit history. It’s found the previous version, run latexdiff on the two TeX files, and compiled the result for us!

Take a look:

mrmagoo:my-paper andycasey$ ls
README              manuscript-revisions-v0.2.pdf   manuscript.tex

Now we can send out the revised version (manuscript.pdf), as well as a PDF with the highlighted changes between version 0.1 and version 0.2 (manuscript-revisions-v0.2.pdf). This will happen any time you commit with something like “revision vX” in the commit message. You can be as pedantic as you like: revision v1, revision v32.4, revision v0.1.3, etc. are all acceptable. The easiest way to create automatic PDF diff files, ever!
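The matching is easy to sanity-check. A quick sketch (this regex is my approximation of the pattern the hook greps for) of which commit messages would trigger the diff:

```python
import re

# Case-insensitive "revision v<number>[.<number>...]" anywhere in the message.
pattern = re.compile(r"revision v\d+(?:\.\d+)*", re.IGNORECASE)

messages = [
    "Put in changes suggested by Bob. Revision v0.2 ready to be sent out",
    "revision v32.4",
    "REVISION V0.1.3 after referee report",
    "Fixed a typo",
]
print([bool(pattern.search(m)) for m in messages])
# → [True, True, True, False]
```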

Here’s what manuscript-revisions-v0.2.pdf looks like:


This makes it infinitely easier for your co-authors to digest what has changed, and will drastically shorten the turnaround between manuscript revisions. If you were wondering, it doesn’t take the first TeX file it sees; it finds the TeX file that has been edited the most times in the repository, which is probably your manuscript!
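You can test that "most edited TeX file" pipeline on a throwaway repository (a sketch; the repository and filenames here are made up):

```shell
cd "$(mktemp -d)"
git init -q && git config user.email you@example.com && git config user.name you

# ms.tex is committed twice, notes.tex once.
echo 'a'  > ms.tex;    git add ms.tex;    git commit -qm "one"
echo 'b' >> ms.tex;    git add ms.tex;    git commit -qm "two"
echo 'x'  > notes.tex; git add notes.tex; git commit -qm "three"

# The pipeline from the hook: count file appearances across all commits,
# then keep the most frequent .tex file.
git log --pretty=format: --name-only | sort | uniq -c | sort -rg \
    | grep ".tex$" | head -n 1 | awk '{ print $2 }'
# → ms.tex
```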

A reasonably well-known astrophysics professor once gave me some unsolicited advice:

“I always told people that if they cited me, I’d buy them a beer for every citation.”

He went on to say that even though he had a very well-known astrophysical relationship named after him, many more people knew him because of his open beverage offer. I thought this was a good idea, and recently I’ve been toying with the new API for the NASA/SAO Astrophysics Data System. You can check out my code on Github. For the most recent example I’ve written a little script that will check to see if I have any new citations, and will alert me to whom I owe beer(s). Here’s the code:

# coding: utf-8

""" Beers for citations. The new underground currency. """

__author__ = "Andy Casey <acasey@mso.anu.edu.au>"

# Standard library
import httplib
import json
import os
import urllib
from collections import Counter

# Module specific
import ads

# Couple of mutable variables for the reader
author_query = "^Casey, Andrew R."
records_filename = "citations.json"

my_papers = ads.search(author_query)

# How many citations did we have last time this ran?
if not os.path.exists(records_filename):
    all_citations_last_time = {"total": 0}
else:
    with open(records_filename, "r") as fp:
        all_citations_last_time = json.load(fp)

# Build a dictionary with all of our citations
bibcodes, citations = zip(*[(paper.bibcode, paper.citation_count) 
    for paper in my_papers])

all_citations = dict(zip(bibcodes, citations))
all_citations["total"] = sum(citations)

# Check if we have more citations than last time, but only if we have run 
# this script beforehand, too. Otherwise we'll get 1,000 notifications on
# the first time the script has been run
if  all_citations["total"] > all_citations_last_time["total"] \
and len(all_citations_last_time) > 1:

    # Someone has cited us since the last time we checked.
    newly_cited_papers = {}
    for bibcode, citation_count in zip(bibcodes, citations):

        new_citations = citation_count - all_citations_last_time.get(bibcode, 0)

        if new_citations > 0:
            # Who were the first authors for the new papers that cited us?
            citing_papers = ads.search("citations(bibcode:{0})"
                .format(bibcode), rows=new_citations)
            newly_cited_papers[bibcode] = [paper.author[0] for paper in citing_papers]

    # Ok, so now we have a dictionary (called 'newly_cited_papers') that contains 
    # the bibcodes and names of authors who we owe beers to. But instead, we
    # would like to know how many beers we owe, and who we owe them to.
    beers_owed = Counter(sum(newly_cited_papers.values(), []))

    # Let's not buy ourself beers.
    if my_papers[0].author[0] in beers_owed:
        del beers_owed[my_papers[0].author[0]]

    for author, num_of_beers_owed in beers_owed.iteritems():

        readable_name = " ".join([name.strip() for name in author.split(",")[::-1]])
        this_many_beers = "{0} beers".format(num_of_beers_owed) \
            if num_of_beers_owed > 1 else "a beer"
        message = "You owe {0} {1} because they just cited you!".format(
            readable_name, this_many_beers)


        if not "PUSHOVER_TOKEN" in os.environ \
        or not "PUSHOVER_USER" in os.environ:
            print("No pushover.net notification sent because PUSHOVER_TOKEN or"
                " PUSHOVER_USER environment variables not found.")

        conn = httplib.HTTPSConnection("api.pushover.net:443")
        conn.request("POST", "/1/messages.json",
            "token": os.environ["PUSHOVER_TOKEN"],
            "user": os.environ["PUSHOVER_USER"],
            "message": message
          }), { "Content-type": "application/x-www-form-urlencoded" })

    print("No new citations!")

# Save these citations
with open(records_filename, "w") as fp:
    json.dump(all_citations, fp)

That script will only work if you already have an ADS 2.0 account and an API key for ADS stored in ~/.ads/dev_key. The first time you run it, you won’t get any notifications; this is just to make sure you don’t get 1,000+ notifications on the first run.

In the above example, it will keep track of your citations in records_filename. That way you can run the script as frequently as you like (e.g., daily, weekly, monthly) and it will only pick up citations received since the last run. That makes it really easy to set up a cron job; for example, we can be notified automatically on the first of each month when we’re cited:

andycasey@moron>crontab -l
# m h  dom mon dow   command
  0 7   1   *   *    python beers-for-cites.py 

This is great, but at the moment it will just print out who we owe beer(s) to. In reality, if we’re running this as a cron job, then we’ll want to be notified somehow. I like to use Pushover.net to send free notifications to my devices. So I’ve created an account and an application called “Beers for Citations”, then stored the application token and user key in the environment variables PUSHOVER_TOKEN and PUSHOVER_USER. Now I’ll get a notification on my phone when someone cites any of my papers.

The end result looks something like this:


So there you go. Cite any of my papers and I’ll buy you a beer the next time I see you.