arXiv NG: incremental decoupling and search

Martin (our IT Team Lead) and I gave a brief update on the arXiv-NG project at the Open Repositories 2018 conference in Bozeman, MT last week, so it seems like a good time to offer an update here as well. For a high-level refresher on what we’re up to, check out my earlier post. In this post, I’ll provide a bit more detail about how we’re migrating from a legacy codebase to a more evolvable architecture, illustrated by our recent work on our search interface.

The classic (read: legacy) arXiv platform is complex, and the fact that we are in the midst of re-architecting that system in fairly dramatic ways makes it difficult to provide a visual representation of progress. Here is my best attempt, from the perspective of how data “moves” through the arXiv platform. Each of the polygons represents a notional component of the classic system and/or a separate service or application in the NG architecture.

Notional overview of the arXiv system, depicting how data generally “moves” through the platform. This starts in the submission system with new papers, and flows through to a variety of access and discovery interfaces.


Our recent outage

You may have noticed that arXiv was mostly unavailable for several hours on March 26, ultimately leading us to postpone the mailing for that evening. First, please accept our apology for this major service disruption; we know that many of you rely on arXiv as part of your daily workflows. Since this was our second major service outage in four months, you may be wondering about arXiv’s long-term reliability. This is certainly something that keeps us up at night (firefighting notwithstanding), and we are actively pursuing options to improve our failover capabilities.

So what happened? Our service provider experienced a major failure with its shared filesystem (SFS) service, causing networked filesystems to suddenly become unavailable for arXiv and numerous other clients at Cornell University. Our service was simply not prepared to handle this type of failure scenario; years of otherwise dependable service had given us a false sense of security, and we ultimately failed to plan for it properly.

After considerable situation assessment and server wrangling, we were eventually able to redirect users to our mirror servers. Once our service provider resolved the problem on their end, we were given the green light to reboot our servers, which restored access to our networked filesystems. No primary or backup data was lost or corrupted, so we were able to bring the service back to its normal state very shortly after the reboots. Since the outage spanned our scheduled publish cycle, we were regrettably forced to postpone the mailing to the next day, hence no new announcements in your inbox the following morning.

Where do we go from here? In the long term, we have already made architectural moves for arXiv-NG that will prevent this kind of catastrophic outage from taking down the whole system. But we also consider resiliency to this kind of failure to be a high priority in the short term, as well. On the day following the outage, the arXiv development team convened to brainstorm failover options and improvements to our processes, and we have identified specific steps to better handle this type of failure that we will begin implementing over the next few days. This will include changes to how our existing web servers are configured, cluster-level changes to ensure the availability of public interfaces even when networked storage goes down, and incorporating off-site failover options using infrastructure developed for arXiv-NG.

We again apologize for this disruption in service and thank you for your continued support of arXiv!

Annual Update

We are pleased to provide an update with a brief summary of our 2017 activities and 2018 plans:

We remain grateful for strong support from our member organizations and the Simons Foundation, and for essential contributions from arXiv’s advisory groups, which consistently provide us with input as representatives of the scientific and library communities. We salute the contributions of the 170 volunteer moderators who are crucial to our operation. We would also like to thank the Sloan Foundation and the Heising-Simons Foundation for their generous support of the next generation initiative.

arXiv Team
Oya Y. Rieger (Program Director), Steinn Sigurdsson (Scientific Director), Jim Entwood (Operations Manager), Martin Lessmeister (IT Lead), Sandy Payette (Technology Strategy Advisor), Erick Peirson (Lead Architect), Gail Steinhart (Program Associate), Chloe McLaren (Membership Program Coordinator)


The arXiv received 67 preprints as part of today’s announcement of the discovery by the LIGO/Virgo Collaboration of the coalescence of a binary neutron star in NGC4993, accompanied by a short Gamma Ray Burst.

The preprints were submitted over a period of several days and were held to be released together as a contiguous block on astro-ph and gr-qc. Two of the preprints were held back because of technical issues, leaving a batch of 65 to be released. The plan to make the release a contiguous block of arXiv IDs failed for technical reasons. Our admins worked late to diagnose the source of the problem, which boiled down to a flaw in the script setting up the block: it implicitly assumed that any such block of preprints would be submitted on the same day…

There are other manuscripts discussing the event; those are from independent researchers generally unaffiliated with the collaboration. The list of LVC and EM collaboration preprints we were informed about is below:

  1. GW170817: Observation of Gravitational Waves from a Binary Neutron Star Inspiral:
  2. Multi-messenger Observations of a Binary Neutron Star Merger:
  3. Gravitational Waves and Gamma-rays from a Binary Neutron Star Merger: GW170817 and GRB 170817A:
  4. A gravitational-wave standard siren measurement of the Hubble constant:
  5. Estimating the Contribution of Dynamical Ejecta in the Kilonova Associated with GW170817:
  6. GW170817: Implications for the Stochastic Gravitational-Wave Background from Compact Binary Coalescences:
  7. On the Progenitor of Binary Neutron Star Merger GW170817:
  8. Search for High-energy Neutrinos from Binary Neutron Star Merger GW170817 with ANTARES, IceCube, and the Pierre Auger Observatory:
  9. Fermi-LAT observations of the LIGO/Virgo event GW170817:
  10. An Ordinary Short Gamma-Ray Burst with Extraordinary Implications: Fermi-GBM Detection of GRB 170817A:
  11. The Electromagnetic Counterpart of the Binary Neutron Star Merger LIGO/Virgo GW170817. I. Dark Energy Camera Discovery of the Optical Counterpart:
  12. The Electromagnetic Counterpart of the Binary Neutron Star Merger LIGO/VIRGO GW170817. II. UV, Optical, and Near-IR Light Curves and Comparison to Kilonova Models:
  13. The Electromagnetic Counterpart of the Binary Neutron Star Merger LIGO/VIRGO GW170817. III. Optical and UV Spectra of a Blue Kilonova From Fast Polar Ejecta:
  14. The Electromagnetic Counterpart of the Binary Neutron Star Merger LIGO/VIRGO GW170817. IV. Detection of Near-infrared Signatures of r-process Nucleosynthesis with Gemini-South:
  15. The Electromagnetic Counterpart of the Binary Neutron Star Merger LIGO/VIRGO GW170817. V. Rising X-ray Emission from an Off-Axis Jet:
  16. The Electromagnetic Counterpart of the Binary Neutron Star Merger LIGO/VIRGO GW170817. VI. Radio Constraints on a Relativistic Jet and Predictions for Late-Time Emission from the Kilonova Ejecta:
  17. The Electromagnetic Counterpart of the Binary Neutron Star Merger LIGO/VIRGO GW170817. VII. Properties of the Host Galaxy and Constraints on the Merger Timescale:
  18. The Electromagnetic Counterpart of the Binary Neutron Star Merger LIGO/VIRGO GW170817. VIII. A Comparison to Cosmological Short-duration Gamma-ray Bursts:
  19. Swope Supernova Survey 2017a (SSS17a), the Optical Counterpart to a Gravitational Wave Source:
  20. Light Curves of the Neutron Star Merger GW170817/SSS17a: Implications for R-Process Nucleosynthesis:
  21. Early Spectra of the Gravitational Wave Source GW170817: Evolution of a Neutron Star Merger:
  22. The Unprecedented Properties of the First Electromagnetic Counterpart to a Gravitational Wave Source:
  23. Origin of the heavy elements in binary neutron-star mergers from a gravitational wave event:
  24. The Old Host-Galaxy Environment of SSS17a, the First Electromagnetic Counterpart to a Gravitational Wave Source:
  25. Electromagnetic Evidence that SSS17a is the Result of a Binary Neutron Star Merger:
  26. A Neutron Star Binary Merger Model for GW170817/GRB170817a/SSS17a:
  27. Illuminating Gravitational Waves: A Concordant Picture of Photons from a Neutron Star Merger:
  28. A Radio Counterpart to a Neutron Star Merger:
  29. Swift and NuSTAR observations of GW170817: detection of a blue kilonova:
  30. The X-ray counterpart to the gravitational wave event GW 170817:
  31. A kilonova as the electromagnetic counterpart to a gravitational-wave source:
  32. Optical Follow-up of Gravitational-wave Events with Las Cumbres Observatory:
  33. Optical emission from a kilonova following a gravitational-wave-detected neutron-star merger:
  34. Observations of the first electromagnetic counterpart to a gravitational wave source by the TOROS collaboration:
  35. The Emergence of a Lanthanide-Rich Kilonova Following the Merger of Two Neutron Stars:
  36. How Many Kilonovae Can Be Found in Past, Present, and Future Survey Datasets?:
  37. Optical Observations of LIGO Source GW 170817 by the Antarctic Survey Telescopes at Dome A, Antarctica:
  38. Follow up of GW170817 and its electromagnetic counterpart by Australian-led observing programs:
  39. ALMA and GMRT constraints on the off-axis gamma-ray burst 170817A from the binary neutron star merger GW170817:
  40. J-GEM observations of an electromagnetic counterpart to the neutron star merger GW170817:
  41. The unpolarized macronova associated with the gravitational wave event GW170817:
  42. Kilonova from post-merger ejecta as an optical and near-infrared counterpart of GW170817:
  43. MASTER optical detection of the first LIGO/Virgo neutron stars merging GW170817:
  44. A peculiar low-luminosity short gamma-ray burst from a double neutron star merger progenitor:
  45. AGILE Observations of the Gravitational Wave Source GW 170817: Constraining Gamma-Ray Emission from a NS-NS Coalescence:
  46. The Diversity of Kilonova Emission in Short Gamma-Ray Bursts:
  47. The environment of the binary neutron star merger GW170817:
  48. The first direct double neutron star merger detection: implications for cosmic nucleosynthesis:
  49. A Deep Chandra X-ray Study of Neutron Star Coalescence GW170817:
  50. Afterglows and Macronovae Associated with Nearby Low-Luminosity Short-Duration Gamma-Ray Bursts: Application to GW170817/GRB170817A:
  51. GRB170817A associated with GW170817: multifrequency observations and modeling of prompt gamma-ray emission:
  52. INTEGRAL Detection of the First Prompt Gamma-Ray Signal Coincident with the Gravitational Wave Event GW170817:
  53. The Rapid Reddening and Featureless Optical Spectra of the optical counterpart of GW170817, AT 2017gfo, During the First Four Days:
  54. The discovery of the electromagnetic counterpart of GW170817: kilonova AT 2017gfo/DLT17ck:
  55. A comparison between SALT/SAAO observations and kilonova models for AT 2017gfo: the first electromagnetic counterpart of a gravitational wave transient – GW170817:
  56. The Distance to NGC 4993: The Host Galaxy of the Gravitational-wave Event GW170817:
  57. GRB 170817A as a jet counterpart to gravitational wave trigger GW 170817:
  58. Spectroscopic identification of r-process nucleosynthesis in a double neutron star merger:
  59. Jet-driven and jet-less fireballs from compact binary mergers:
  60. Multimessenger tests of the weak equivalence principle from GW170817 and its electromagnetic counterparts:
  61. Distance and properties of NGC 4993 as the host galaxy of a gravitational wave source, GW170817:
  62. TeV gamma-ray observations of the binary neutron star merger GW170817 with H.E.S.S:
  63. Lanthanides or dust in kilonovae: lessons learned from GW170817:
  64. An empirical limit on the kilonova rate from the DLT40 one day cadence Supernova Survey:
  65. Subaru Hyper Suprime-Cam Survey for An Optical Counterpart of GW170817:

Donate to arXiv

arXiv will run an online fundraising campaign for four days, October 16-19, 2017, to help raise additional funds (see Donations to arXiv). arXiv’s baseline maintenance costs are sponsored by more than 210 member organizations, the Simons Foundation, and Cornell University Library. This online campaign aims to garner additional resources from the program’s active and supportive user base. Stewardship of resources such as arXiv involves not only covering the operational costs but also continuing to enhance their value based on the needs of the user community and the evolving patterns and modes of scholarly communication. It is essential to raise additional funds in order to support new initiatives that go beyond routine operational work, and to robustly support arXiv’s Open Access mission. Donations to arXiv are tax deductible, eligible for employer matches via Benevity, and easy to schedule.

Donations can be made here. We thank you for your support.

Next-Gen arXiv Initiative Update

The goal of the next-gen arXiv (arXiv-NG) initiative is to improve the architecture of the service’s core infrastructure. Funded by the Sloan Foundation, the ongoing Phase I planning activities (December 2016-May 2018) aim to evaluate different scenarios for the overall development and replacement process, settle on a specific one, and initiate technical development work. At this stage, we have largely settled on a strategy of incremental and modular renewal of the existing arXiv system (“Classic Renewal”), rather than building an entirely new system and migrating to it. Currently, we are in the process of assessing various technology components as well as identifying collaboration and partnership options. We will put in place initial testing and implementation of some key technology components and develop new organizational and staffing models to ensure continuity in operations. The completion of the arXiv-NG initiative is anticipated to take approximately three years. Although the funding streams are separate, the arXiv team will take an integrated approach, treating the current operational system (Classic arXiv) and the next-gen system as a unified program. This approach is essential as we recruit and retain staff who will need to be conversant with both the old and new systems, and with the transition from one to the other. We will strategically expand the core arXiv development team to bring new skills to the arXiv-NG project, while continuing to provide excellent support and maintenance of the production arXiv system.

Stay tuned as we’ll keep you informed of new developments!

Originally published 31 May 2017.

Annual update

We have posted an update with a brief summary of our 2016 activities and plans for 2017:

arXiv Team: Oya Y. Rieger (Program Director), Jim Entwood (Operations Manager), Chloe McLaren (Membership Program Coordinator), Sandy Payette (CTO), Gail Steinhart (Program Associate)

Originally published 13 Jan 2017.

2017 roadmap ready

The 2017 roadmap is now available:

2017 arXiv Roadmap

The arXiv roadmap is a living document and communication tool to accommodate continuous prioritization throughout the year. Items are listed in approximate priority order, subject to change based upon consideration of input from arXiv stakeholders, consideration of new opportunities and initiatives as they arise, and progress on next generation arXiv development.

Originally published 11 Jan 2017.

2017 budget now available

The projected 2017 arXiv budget is now available on the budgets page.

Please note that the reserve balances reflect the 2016 starting balances. These will be revised to the accurate 2017 starting balances when the 2016 budget is closed (in Jan-Feb 2017).

Originally published 16 Dec 2016.

Alfred P. Sloan Foundation awards grant for arXiv upgrade

We’re pleased to announce the first phase of a three-year overhaul and modernization, with the help of a $445,000 grant from the Alfred P. Sloan Foundation. Read the full press release.

Originally published 12 Dec 2016.

