Collaborative analysis over massive time series data sets - Ed Duarte

Master's Thesis

The full text of this thesis is currently under embargo until the 21st of December 2021, but can still be provided under some circumstances by requesting it here.

The recent expansion of everyday metrification has led to the production of massive quantities of data. In many cases, these collected metrics are only useful for knowledge building when seen as a full sequence of data ordered by time, which constitutes a time series. To find and interpret meaningful behavioral patterns in time series, a multitude of analysis software tools have been developed. Many of the existing solutions use annotations to enable the curation of a knowledge base that is shared between a group of researchers over a network. However, these tools lack appropriate mechanisms to handle a high number of concurrent requests and to properly store massive data sets and ontologies, as well as suitable representations for annotated data that are visually interpretable by humans and explorable by automated systems.

The goal of the work presented in this dissertation is to iterate on existing time series analysis software and build a platform for the collaborative analysis of massive time series data sets, leveraging state-of-the-art technologies for querying, storing and displaying time series and annotations. A theoretical and domain-agnostic model is proposed to enable the implementation of a distributed, extensible, secure and high-performance architecture that handles multiple annotation proposals simultaneously and avoids any data loss from overlapping contributions or unsanctioned changes. Analysts can share annotation projects with peers, restricting a set of collaborators to a smaller scope of analysis and to a limited catalog of annotation semantics.

Annotations can express meaning not only over a segment of time, but also over a subset of the series that coexist in the same segment. A novel visual encoding for annotations is proposed, where annotations are rendered as arcs traced only over the affected series’ curves in order to reduce visual clutter.
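As a minimal sketch of this idea (not the data model used in the thesis), an annotation can be represented as a label attached to both a time segment and a subset of series. The class name `Annotation`, its fields, and the helper `annotations_at` are hypothetical names introduced here for illustration only.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import FrozenSet, Iterable, List


@dataclass(frozen=True)
class Annotation:
    """Hypothetical annotation: a label over a time segment and a subset of series."""

    label: str              # annotation semantics, e.g. the name of an annotation type
    start: datetime         # inclusive start of the annotated segment
    end: datetime           # inclusive end of the annotated segment
    series: FrozenSet[str]  # names of the affected series only

    def covers(self, series_name: str, t: datetime) -> bool:
        """Return True when the annotation applies to the given series at time t."""
        return series_name in self.series and self.start <= t <= self.end


def annotations_at(
    annotations: Iterable[Annotation], series_name: str, t: datetime
) -> List[Annotation]:
    """Collect the annotations that apply to one series at one point in time."""
    return [a for a in annotations if a.covers(series_name, t)]
```

Under this assumed structure, a renderer would only need to draw an annotation over the curves whose names appear in `series`, leaving the other series that coexist in the same segment untouched.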

Moreover, the implementation of a full-stack prototype with a reactive web interface is described, directly following the proposed architectural and visualization model and applied to the HVAC domain. The performance of the prototype under different architectural approaches was benchmarked, and the interface was tested for usability. Overall, the work described in this dissertation contributes a more versatile, intuitive and scalable time series annotation platform that streamlines the knowledge-discovery workflow.

Outline

  1. Introduction overviews the various concepts surrounding time series analysis, time series visualization techniques, digital annotations (and how these have been used in time series analysis), and distributed systems;

  2. State of the art presents an in-depth overview of state-of-the-art methodologies and technologies currently applied to the analysis, storage and visualization of time series and annotations;

  3. Proposed Model proposes a blueprint for the development of a time series analysis and annotation platform at the theoretical level, describing an ideal architecture and visualization tool for handling both time series and ontology data;

  4. Implementation iterates on the blueprint proposed in Chapter 3 by implementing and describing a working prototype of a collaborative time series analysis architecture and web application, listing the specific tools that were employed and the techniques that further optimized the use of state-of-the-art technologies to meet the stated requirements;

  5. Evaluation takes the implemented platform and benchmarks its features, in order to evaluate how its architecture handles realistic scenarios and how it compares with other potential architectures, as well as how its interface adheres to interaction design standards;

  6. Conclusion and future work gives an account of the behaviors and caveats observed in the prototype during the development and evaluation phases, relates the ways in which the proposed model for time series annotation improves on existing tools, and suggests how the proposed platform can be iterated on in order to extend its capabilities and improve its overall performance and quality.

Publication

I submitted this study as a dissertation (equivalent to a thesis in the US) for the Master’s degree in Software Engineering at the University of Aveiro, passing with distinction with a grade of 19 out of 20, which in the US grading system corresponds to A (with A+ being the highest and F being lowest). The full text can be read on the institutional repository of the University of Aveiro.

To cite this research, you may use the following BibTeX record:

@mastersthesis{EdDuarte/master-thesis2018,
  author = {Duarte, Eduardo Miguel Oliveira},
  title = {Collaborative analysis over massive time series data sets},
  school = {University of Aveiro},
  year = {2018}
}

Acknowledgements

The present study was developed in the scope of the Smart Green Homes Project [POCI-01-0247 FEDER-007678], a co-promotion between Bosch Termotecnologia S.A. and the University of Aveiro. It is financed by Portugal 2020, under the Competitiveness and Internationalization Operational Program, and by the European Regional Development Fund.

This work is licensed under a Creative Commons Attribution 4.0 International License.


