This page collects useful resources regarding Experiment-Oriented Computing (EOC), a concept introduced in my ESEC/FSE 2018 paper The Case for Experiment-Oriented Computing. Computational experimentation technology can be found in many forms, sometimes explicit and dedicated, but more often intertwined with other concerns. In almost all cases I’m aware of, however, there is no proper understanding of the wide scope that such technology can have. Nevertheless, it is useful to map the technology that does exist, since it can help us track, motivate and imagine the progress of proper EOC tools and systems. Please leave a comment if you would like to point out other references.
A/B Test Libraries, Frameworks and Services
In Software Engineering, experimentation is often confused with mere A/B testing. This is actually a very popular technique, so it would be futile to try to curate all such tools here. Rather, I will focus on those that for some reason are particularly interesting or representative.
- Facebook’s PlanOut: a framework for designing and running large-scale A/B tests and online field experiments.
- Optimizely: A popular service that calls itself “the world’s leading experimentation platform.” It allows non-programmers to modify existing web pages in order to determine the effects of such modifications, which it achieves by dynamically instrumenting the page before delivering it to visitors. Other forms of experimentation are also possible, some relating to the personalization of pages with respect to users, locations and perhaps other factors.
- Unbounce: A popular service for designing and deploying landing pages (and, it seems, other artifacts). Critically, it allows for easy A/B testing, provided that the user spends some time designing the versions to be tested.
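One mechanic shared by essentially all of these tools is deterministic variant assignment: each user is hashed into a bucket, so repeat visits see the same version of the experiment. A minimal sketch of that idea (my own illustration, not the API of any tool above):

```python
import hashlib

def assign_variant(user_id: str, experiment: str, variants: list[str]) -> str:
    """Deterministically map a user to one of `variants` by hashing.

    Hashing (experiment, user) together keeps assignments stable for a
    given experiment while remaining independent across experiments.
    """
    digest = hashlib.sha256(f"{experiment}:{user_id}".encode()).hexdigest()
    return variants[int(digest, 16) % len(variants)]

# The same user always lands in the same bucket for a given experiment:
assert assign_variant("user-42", "button_color", ["red", "blue"]) == \
       assign_variant("user-42", "button_color", ["red", "blue"])
```

Real frameworks layer a great deal on top of this (exposure logging, mutually exclusive experiments, gradual ramp-up), but stable hashing is the common foundation.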
Experiment Tracking and Versioning
- Version Control System for Data Science Projects: Uses git to track experiments. Includes a number of additional abstractions, such as metrics and Machine Learning pipelines, in order to allow easier assessment and reproduction of results.
- Sacred: Python library to define, run and track experiments. “Sacred is a tool to configure, organize, log and reproduce computational experiments. It is designed to introduce only minimal overhead, while encouraging modularity and configurability of experiments.”
- Sumatra: Primarily a command-line tool for “managing and tracking projects based on numerical simulation and/or analysis, with the aim of supporting reproducible research. It can be thought of as an automated electronic lab notebook for computational projects.” Also provides a Python library for deeper integrations and customizations.
- Experimenter: Uses git versioning in order to track both experimental setup and results.
- Reccrd: Used to be a Python library and a related online service to store experimentation results. The link, however, no longer points to the right place.
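Despite their different interfaces, these trackers share a minimal core: capture the configuration that produced a run together with the resulting metrics, so that runs can later be compared and reproduced. A toy sketch of that record-keeping idea (my own illustration, not the API of any tool above; the function names are hypothetical):

```python
import json
import time
from pathlib import Path

def log_run(store: Path, config: dict, metrics: dict) -> Path:
    """Record one experiment run (config + metrics) as a JSON file."""
    store.mkdir(parents=True, exist_ok=True)
    run = {"timestamp": time.time(), "config": config, "metrics": metrics}
    # Number files sequentially so runs sort in the order they were logged.
    path = store / f"run_{len(list(store.glob('run_*.json'))):04d}.json"
    path.write_text(json.dumps(run, indent=2))
    return path

def best_run(store: Path, metric: str) -> dict:
    """Return the logged run with the highest value of `metric`."""
    runs = [json.loads(p.read_text()) for p in store.glob("run_*.json")]
    return max(runs, key=lambda r: r["metrics"][metric])
```

The tools above go much further (capturing code versions, environments and artifacts, not just parameters and metrics), but this is the essential loop they automate.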
Experimentation with Users
- Amazon’s Mechanical Turk: One of the best-known platforms for recruiting users to complete arbitrary tasks online, and obviously useful for experimenting with users. For example, Toomim et al. have studied user preferences by running experiments on Mechanical Turk.
- Clickworker: An alternative to Mechanical Turk, with various pre-defined use cases.
- What-If Tool: Part of TensorBoard, it allows users to interact with model features and examples in order to quickly assess their effect on the model. This is a manual tool, designed to let users manipulate and understand models in real time.
Real-World Applications, Laboratories and Companies
- Experimentation at Uber
- Related presentation: A/B testing at Uber: How we built a BYOM (bring your own metrics) platform
- Experimentation to optimize Pinterest’s recommendation system: Diversification in recommender systems: Using topical variety to increase user satisfaction
- Microsoft’s ExP Experimentation Platform. There are many original papers, talks and other contents here, particularly regarding A/B testing.
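A recurring concern in these industrial platforms is deciding whether an observed difference between variants is real or just noise. One common building block is the two-proportion z-test for comparing conversion rates; a bare-bones sketch (my own illustration, not any platform’s implementation):

```python
import math

def two_proportion_ztest(successes_a: int, n_a: int,
                         successes_b: int, n_b: int) -> float:
    """Z statistic for the difference between two conversion rates,
    using the pooled standard error of the standard two-proportion z-test."""
    p_a, p_b = successes_a / n_a, successes_b / n_b
    p_pool = (successes_a + successes_b) / (n_a + n_b)
    se = math.sqrt(p_pool * (1 - p_pool) * (1 / n_a + 1 / n_b))
    return (p_b - p_a) / se

# |z| > 1.96 corresponds roughly to two-sided significance at the 5% level.
z = two_proportion_ztest(120, 1000, 180, 1000)
```

Production systems add much more (sequential testing, variance reduction, guardrail metrics), but this is the basic statistical question they answer at scale.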
The nature of experimentation itself:
- Radder, H. (2009). The philosophy of scientific experimentation: a review. Automated Experimentation, 1(1), 2.
- Kohavi, R., & Longbotham, R. (2017). Online controlled experiments and A/B testing. In Encyclopedia of Machine Learning and Data Mining (pp. 922-929). Springer US.
- Kluyver, T., Ragan-Kelley, B., Pérez, F., Granger, B. E., Bussonnier, M., Frederic, J., … & Ivanov, P. (2016, May). Jupyter Notebooks - a publishing format for reproducible computational workflows. In ELPUB (pp. 87-90).
- Shen, H. (2014). Interactive notebooks: Sharing the code. Nature News, 515(7525), 151.
- Bakshy, E., Eckles, D., & Bernstein, M. S. (2014, April). Designing and deploying online field experiments. In Proceedings of the 23rd international conference on World wide web (pp. 283-292). ACM.
- Silva, M., Hines, M. R., Gallo, D., Liu, Q., Ryu, K. D., & Da Silva, D. (2013, March). CloudBench: Experiment automation for cloud environments. In 2013 IEEE International Conference on Cloud Engineering (IC2E) (pp. 302-311). IEEE.
- Sparkes, A., Aubrey, W., Byrne, E., Clare, A., Khan, M. N., Liakata, M., … & Young, M. (2010). Towards Robot Scientists for autonomous scientific discovery. Automated Experimentation, 2(1), 1.
- Hunter, D., & Evans, N. (2016). Facebook emotional contagion experiment controversy. Research Ethics, 12(1), 2–3.
(Computational) Scientific Discovery:
- Langley, P., Simon, H. A., Bradshaw, G. L., & Zytkow, J. M. (1987). Scientific discovery: Computational explorations of the creative processes. MIT press.
Software analytics (and related experimental concerns). Although, in principle, software analytics can be entirely passive (and therefore not experimental), in reality software provides an ideal medium for supporting experimentation (i.e., because arbitrary interaction can be implemented). Hence, it is worth understanding the area.
- Bird, C., Menzies, T., & Zimmermann, T. (Eds.). (2015). The Art and Science of Analyzing Software Data. Elsevier.
- Menzies, T., Williams, L., & Zimmermann, T. (2016). Perspectives on Data Science for Software Engineering. Morgan Kaufmann.
Philosophy of Science. Unsurprisingly, I find the discipline to be quite insightful.
- Hanson, N. R. (1977). Patterns of discovery an inquiry into the conceptual foundations of science. CUP Archive.