My bibliography
Previously, I described how to automatically add references (citations and a bibliography) to Jekyll blog posts. I use a related method to automatically generate a list of my own publications, which can be seen here.
As I mentioned in that previous post, I have a collection of Python scripts that run when I build my site. This collection also contains the following script, called make_my_bib.py.

"""Create the bibliography file my_papers.yaml."""
import re

from Bio import Entrez

# Three-letter month abbreviations, in calendar order.
MONTHS = "Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec".split()

# Words whose capitalization should be restored in article titles.
KEEP = ["BP1-BP2", "African", "American", "Americans", "MRI", "QTL", "ENIGMA"]


def make_my_bib():
    """Grab all my publications and format them into a YAML bibliography."""
    Entrez.email = "your.email@example.com"
    handle = Entrez.esearch(
        db="pubmed", sort="date", retmax="200", retmode="xml", term="mathias sr[author]"
    )
    pmids = Entrez.read(handle)["IdList"]
    # Extra PMIDs to include alongside the search results.
    extras = ["28480992", "28385874", "27744290", "32170019"]
    pmids += extras
    pmids = ",".join(set(pmids))  # drop duplicates
    handle = Entrez.efetch(db="pubmed", retmode="xml", id=pmids)
    papers = Entrez.read(handle)["PubmedArticle"]
    keep = {word.lower(): word for word in KEEP}
    data = []
    for paper in papers:
        article = paper["MedlineCitation"]["Article"]
        journal = article["Journal"]

        # Build a sortable YYYYMM string; the month is "00" when PubMed omits it.
        date = journal["JournalIssue"]["PubDate"]
        year = date["Year"]
        month = "00" if "Month" not in date else date["Month"]
        if len(month) == 3:
            month = MONTHS.index(month) + 1
        sort = year + "%02d" % int(month)
        ids = paper["PubmedData"]["ArticleIdList"]

        # Format authors as "LastName, I. N."; consortia only have a collective name.
        authors = []
        for _a in article["AuthorList"]:
            if "LastName" in _a:
                a = _a["LastName"] + ", " + ". ".join(_a["Initials"]) + "."
            else:
                a = _a["CollectiveName"].rstrip()
            authors.append(a)
        if len(authors) > 1:
            authors[-1] = "& " + authors[-1]

        # Hard-coded clean-up of the journal title.
        jt = journal["Title"].title().replace("Of The", "of the").replace("And", "and")
        jt = jt.replace("Of", "of").replace(" (New York, N.Y. : 1991)", "")
        jt = jt.replace(" (New York, N.Y.)", "")
        jt = jt.split(" : ")[0]
        jt = jt.replace(". ", ": ").replace("In ", "in ").replace(": Cb", "")
        jt = jt.replace("Jama", "JAMA")
        ji = journal["JournalIssue"]
        volume = None if "Volume" not in ji else ji["Volume"]
        issue = None if "Issue" not in ji else ji["Issue"]

        # Sentence-case the title (and any subtitle after a colon).
        title = article["ArticleTitle"]
        rtn = re.split("([:] *)", title)
        title = "".join([i.capitalize() for i in rtn])
        # Restore the capitalization of selected words (matched case-insensitively).
        title = " ".join(keep.get(s.lower(), s) for s in title.split())

        # Split a page range such as "191-202" into first and last pages.
        k = "Pagination"
        first_page = None if k not in article else article[k]["MedlinePgn"]
        last_page = None
        if first_page:
            first_page, *last_page = first_page.split("-")

        dic = {
            "authors": ", ".join(authors),
            "title": title if title[-1] != "." else title[:-1],
            "journal": jt,
            "year": year,
            "sort": sort,
            "pmid": str(paper["MedlineCitation"]["PMID"]),
            # First article ID that looks like a DOI; raises IndexError if there is none.
            "doi": [str(i) for i in ids if str(i)[:3] == "10."][0].lower(),
        }
        dic["id"] = dic["pmid"]
        if volume:
            dic["volume"] = volume
        if issue:
            dic["issue"] = issue
        if first_page:
            dic["first_page"] = first_page
        if last_page:
            dic["last_page"] = last_page[0]
        if int(year) >= 2010 and "Correction:" not in title:
            data.append(dic)

    with open("../../_data/my_papers.yaml", "w") as fw:
        for paper in data:
            fw.write(f"p{paper['sort'] + paper['title'].split()[0][:2].lower()}:\n")
            del paper["id"]
            for k, v in paper.items():
                fw.write(f"""  {k}: "{v}"\n""")
            fw.write("\n")
        # Append the manually entered publications.
        with open("../../_data/my_papers_manual.yaml") as fr:
            fw.write(fr.read())


if __name__ == "__main__":
    make_my_bib()

This script uses the third-party Biopython package, specifically its Entrez subpackage, to grab a list of all my publications from PubMed. It collects metadata from these publications, performs a few hard-coded edits (such as fixing the capitalization of journal names and article titles), and writes them to a YAML file called my_papers.yaml. It also appends data from another YAML file called my_papers_manual.yaml which, as the name suggests, contains manually entered publications that are not listed on PubMed. Here’s an example item from that file:
p201601re:
  authors: "Mathias, S. R., Knowles, E. E., Kent, J. W., McKay, D. R., Curran, J. E., de Almeida, M. A., Dyer, T. D., Göring, H. H., Olvera, R. L., Duggirala, R., Fox, P. T., Almasy, L., Blangero, J., & Glahn, D. C."
  title: "Recurrent major depression and right hippocampal volume: A bivariate linkage and association study"
  journal: "Human Brain Mapping"
  year: "2016"
  sort: "201601"
  pmid: "26485182"
  doi: "10.1002/hbm.23025"
  volume: "37"
  issue: "1"
  first_page: "191"
  last_page: "202"
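
The key of each entry (p201601re in the example) can be reconstructed from its other fields: the script writes the letter p, then the six-digit sort value (year plus zero-padded month), then the first two letters of the title in lower case. Here is a minimal sketch of that rule. The helper name entry_key is hypothetical and is not part of make_my_bib.py.

```python
def entry_key(sort: str, title: str) -> str:
    """Build a bibliography key such as 'p201601re' (hypothetical helper)."""
    # "p" + YYYYMM + first two characters of the title's first word, lower-cased.
    return "p" + sort + title.split()[0][:2].lower()


example = entry_key(
    "201601",
    "Recurrent major depression and right hippocampal volume: "
    "A bivariate linkage and association study",
)
print(example)
```

One consequence of this scheme is that two papers published in the same month whose titles start with the same two letters would end up with the same key, so such cases may need a manual tweak.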
These data are interpreted by the following Liquid code embedded within the static page publications.md.
{% assign papers = site.data.my_papers | sort %}
{% for paper in papers reversed %}
  {% include citation.html %}
{% endfor %}
The array papers is sorted and then iterated in reverse, which yields reverse chronological order so that the most recent publication appears first. See my previous post to understand how citation.html works.
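
The ordering works because every key begins with p followed by a fixed-width YYYYMM value, so sorting the keys as plain strings also sorts them by date. The Python sketch below mimics that behaviour, assuming that Liquid's sort filter orders the entries by key. All sample keys except p201601re are invented for illustration.

```python
# Hypothetical keys in the style written by make_my_bib.py.
keys = ["p201601re", "p202003ab", "p201512zz", "p201610co"]

# String order equals chronological order because the YYYYMM part is fixed-width.
newest_first = sorted(keys, reverse=True)
print(newest_first)
```

Entries with an unknown month get 00 in the sort value, so under this scheme they sort before any dated entry from the same year and appear last for that year in the reversed list.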
Version history
- Originally posted September 05, 2020.
Related posts
- “Scholarly references in Jekyll,” Sep 02, 2020.