Last week I attended the annual EarthCube All Hands Meeting (ECAHM) in Alexandria, Virginia, just outside Washington, DC. Since it’s been a while since my last post, I thought it would be interesting to share my notes and highlights here for anyone who missed the event.
ECAHM meetings are usually very enriching experiences, as they bring together a variety of researchers from different fields related to the geosciences, ranging from computer scientists to volcanologists and marine biologists. The purpose of the meeting is to gather the community and hear everyone report back on their NSF-funded EarthCube projects, which aim to improve cyberinfrastructure in the geosciences. As a computer scientist, I think this is a great meeting to attend for two main reasons: first, you always learn something new, even if it’s not in your domain. Second, people are extremely grateful for your contributions, as you are helping them become more effective when doing their science.
So, what was I doing at ECAHM 2018?
I attended the meeting to present our latest progress on OntoSoft, a distributed software metadata registry we created at ISI to help scientists describe their software. You can see the poster abstract online (and soon the poster itself). I also participated in a “speed-dating session”, where I spent half an hour discussing how to describe software with a domain scientist; and I stood in for Yolanda Gil on a panel on external partnership opportunities, where I presented the Open Knowledge Network initiative. This effort, led by NITRD, is a great opportunity to create a shared open knowledge graph that both research and industry would use, refine and curate. The idea is that this knowledge graph becomes part of the US infrastructure the same way supercomputers currently are, so anyone could benefit from it and also contribute to it. It looks like the NSF is keen to pursue this objective too.
Two colleagues of mine also presented other initiatives I am involved in. Deborah Khider showcased our efforts towards structuring metadata and creating standards in the paleoclimate sciences, together with a set of tools that a team of paleoclimate scientists has developed to work with that structured data. She also managed to mix Star Wars and Star Trek themes in her poster and presentation, which was well received by the attendees (I think everyone stopped at her poster).
As expected, keynotes at ECAHM are nothing like those at venues such as AAAI or IUI. The first speaker was Dean Pesnell (NASA), who presented the research carried out by his team on studying the sun and sunspots. Why is this related to geosciences? Because the sun could be considered “our ground truth for the universe”, and anything related to its activity has many implications for every field of the geosciences. Their main problem is how to analyze the amount of data that they have. Each of their datasets may contain several hundred million images, so proper metadata is crucial (you don’t want to find out you have downloaded 300 million images for nothing). Dean showed some impressive videos of their observations of the sun, as well as their pipelines to handle “very big data” analyses.
The second speaker was Sarah Stamps, who talked about continental rifting and the Tanzania Volcano Observatory. Apparently, geologists are among the few people in the world who would run towards an erupting volcano instead of away from it. Sarah described what they are setting up in the East African Rift System (EARS), and how they teamed up with CHORDS to enable real-time analysis of the observations they collect in the field. Thanks to her work, they are developing an early warning system for hazard detection! Sarah was departing soon to set up a few more observing stations in the field, so best of luck!!
The third speaker was Caroline S. Wagner, who gave some metrics on the social side of interdisciplinary collaboration. Science has become increasingly collaborative and team based, and the number of international collaborations has doubled in the past years. The number of countries producing 95% of research has gone from 7 to 15, which indicates we are moving in the right direction. However, more than 50% of the articles are currently never cited. A few takeaways from this talk are: 1) International collaborations start face to face, so go to different events and meet new people; 2) Diverse teams usually take longer to be productive, as people don’t usually speak the same language. Be patient!!; 3) Work towards a solution, not towards interdisciplinary teams. Interdisciplinarity should be the means to an end, not the end itself.
Below are some additional highlights I found interesting for the EarthCube community.
Eva Zanzerkia reported on the NSF 10 Big Ideas, which nicely summarize the interests of the agency in terms of funding in the next years. The report has been out for more than a year, but it’s never too late to catch up!
Doug Fils presented the plan for turning P418 into something bigger. In case you don’t know, P418 currently tracks the metadata of datasets exposed with schema.org markup and aggregates it in a search engine for scientific data. Future plans are to ingest other types of resources and make the code base stable.
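To make the “datasets exposed as schema.org” part more concrete, here is a minimal sketch of the kind of metadata record a harvester could pick up. It is only an illustration: the `dataset_jsonld` helper, the dataset itself and its URL are made up, and I am not describing how P418 actually ingests or indexes anything. The property names (`name`, `description`, `url`, `keywords`, `license`) are standard schema.org `Dataset` properties.

```python
import json


def dataset_jsonld(name, description, url, keywords, license_url):
    """Build a minimal schema.org Dataset description as a JSON-LD dict.

    Hypothetical helper for illustration; not part of P418.
    """
    return {
        "@context": "https://schema.org/",
        "@type": "Dataset",
        "name": name,
        "description": description,
        "url": url,
        "keywords": list(keywords),
        "license": license_url,
    }


# Every value below is invented for the example.
example = dataset_jsonld(
    name="Example sea surface temperature records",
    description="Illustrative dataset entry; not a real EarthCube resource.",
    url="https://example.org/datasets/sst",
    keywords=["oceanography", "temperature"],
    license_url="https://creativecommons.org/licenses/by/4.0/",
)

# On a dataset landing page, this JSON would usually be embedded in a
# <script type="application/ld+json"> element for crawlers to read.
print(json.dumps(example, indent=2))
```

A search engine over records like this only needs to crawl the landing pages and index the fields, which is why agreeing on the vocabulary matters more than the tooling.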
Interesting working lunch idea: A napkin drawing exercise. Do you know how to present your idea with a simple sketch?
The Association for the Advancement of Artificial Intelligence conference (AAAI) is held once a year to bring together experts from heterogeneous fields of AI and discuss their latest work. It is also a great venue if you are looking for a new job, as different companies and institutions often announce open positions. Last week, the 31st edition of the conference was held in downtown San Francisco, and I attended the whole event. If you missed the conference and are curious about what was going on, make sure you read the rest of this post.
But first: what was I doing there?
I attended the conference to co-present a tutorial and a poster.
The poster I presented described the latest additions to the DISK framework. In a nutshell, we have adapted our system for automating hypothesis analysis and revision to operate on data that is constantly growing. While doing this, we keep a detailed record of the inputs, outputs and workflows needed to revise the hypothesis. Check out our paper for details!
Ok, enough self-promotion! Let’s get started with the conference:
In general, the quality of the keynotes and talks was outstanding. The presenters made a great effort to talk about their topics without getting lost in the details of their fields.
Rosalind Picard started the week by talking about AI and emotions, or, using her own terms, “affective computing”. Detecting the emotion of the person interacting with the system is pivotal for decision making. But recognizing these emotions is not trivial (e.g., many people smile when they are frustrated, or even angry). It’s impressive how sometimes just training neural networks with sample data is not enough, as the history of the gestures plays an important role in the detection as well. Rosalind described her work on detecting and predicting emotions like the interest of an audience or stress. Thanks to a smart wristband they are able to predict seizures and breakouts in autistic kids. In the future, they aim to be able to predict your mood and possible depressions!
On Tuesday, the morning keynote was given by Steve Young, who talked about speech recognition and human-bot interaction. Their approach is mostly based on neural networks and reinforcement learning. Curiously enough, this approach works better in the field (with real users) than with simulated results (for which other approaches work better). The challenges in this area lie in determining when a dialog is not accurate, as users tend to lie a lot when providing feedback. In fact, maybe the only way of knowing that something went wrong in a dialog is when it’s too late and the dialog has failed. As a person working in the Semantic Web domain, I found it interesting that knowledge bases are uncharted territory in this field at the moment.
Jeremy Frank spoke in the afternoon session for IAAI. He focused on the role of AI in autonomous space missions, where sometimes the communications are interrupted and many anomalies may occur. The challenge in this case is not only to be able to plan what the robot or ship is going to do, but to monitor the plan and explain whether an order or a command did what it was actually supposed to. In this scenario, having new software becomes a risk.
On Wednesday, Dmitri Dolgov was in charge of talking about self-driving cars. More than 10 trillion miles are travelled every year across the world, with over 1.2 million casualties in accidents, 94% of which are caused by human error. The speaker gave a great overview of the evolution of the field, starting in 2009, when they wanted to understand the problem and created a series of challenges to drive 100 miles in different scenarios. By 2010, they had developed a system good enough to drive a blind man across town, automatically. In 2012, the system was robust enough to drive on freeways. By 2015, they had finally achieved their goal: a completely driverless vehicle, without steering wheel or pedals. A capability of the system that surprised me is that it is able to read and mimic human behavior at intersections or stop signs without any trouble. In order to do this, the sensor data has to be very accurate, so they ended up creating their own sensors and hardware. As in the other talks, deep learning techniques have helped enormously to recognize certain scenarios and operate accordingly. Having the sensor data available has also helped. These cars have more than 1 billion virtual miles of training, and they are failing less and less as time goes by.
The afternoon session was led by Kristen Grauman, an expert in computer vision who analyzed how image recognition works in unlabeled video. The key challenge in this case is to be able to learn from images in a more natural way, as animals do. It turns out that our movement is heavily correlated with our sense of vision, to the point that if we don’t allow an animal to move freely when it’s growing up and viewing the world, it may be damaged permanently. Therefore, maybe machines should learn from images in movement (videos) to better understand the context of an image. The first results in this direction look promising, and the system has so far learned to track relevant moving objects in video, by itself.
The final day opened with a panel that I am going to include in the keynote group, as it has been one of the breakthroughs of this year. An AI has recently beaten all the professional players against whom it has played poker (one-on-one), and two of the lead researchers in the field (Michael Bowling and Tuomas Sandholm) were invited to show us how they did it. Michael started by describing DeepStack and why poker is a particularly interesting challenge for AI: while in other games like chess you have all the information you need at a given state to decide your next move, poker is an imperfect information game. You may have to remember the history of what has been done in order to proceed with your next decision. This creates a decision tree that is even bigger than those of complex board games like chess and Go, so researchers have to abstract and explore the sparse tree. The problem is that, at some point, something may have happened that wasn’t taken into account in the abstraction, and this is where the problems start.
Their approach for addressing this issue is to reason over the possible cards that the opponent thinks the system has (game theory and Nash equilibrium play a crucial role). The previous history determines distributions of the cards, while evaluation functions have different heuristics based on the beliefs of the players in the current game (deep learning is used to choose the winning situation out of the possibilities). While current strategies are very exploitable, DeepStack is one of the least exploitable, being able to make 8 times what a regular player makes while running on a laptop during the competition (the training part takes place beforehand).
Tuomas followed by introducing Libratus, an AI created last year but evolved from previous efforts. Libratus shares some strategies with DeepStack (card abstraction, etc.), as the poker community has worked together on interoperable solutions. Libratus is the AI that actually played against the poker professionals and beat them, even when they had a $200K incentive for the winner. The speaker mentioned that instead of trying to exploit the weaknesses of the opponent, Libratus focused on how the opponent exploits the strategies used by the AI. This way, Libratus could learn and fix these holes.
According to the follow-up discussion, Libratus could probably defeat DeepStack, but they haven’t played against each other yet. The next challenges are applying these algorithms to solve similar issues in other domains, and making an AI that can actually be part of a table and join tournaments (this may imply a redefinition of the problem). Both researchers ended by stating how supportive the community has been in providing feedback and useful ideas to improve their respective AIs.
The last keynote speaker was Russ Tedrake (MIT Robot labs), who presented advances in robotics and the lessons learned during the three-year DARPA challenge on robotics. The challenge had a series of heterogeneous tasks (driving, opening a valve, cutting a hole in a wall, opening and traversing a door, etc.). Most of these problems are framed as optimization problems, and planning is a key feature that has to be updated on the go. Robustness is crucial for all the processes. For example, in the challenge, the MIT robot failed due to a human error and an arm broke off. However, thanks to the redundancy functions, the robot could finish the rest of the competition using only the other arm. As a side note, the speaker also explained why the robots always “walk funny”: it comes down to their center of mass. Walking that way simplifies the equations for movement, so researchers have adopted it to avoid more complexity in their solutions.
One of the main challenges for these robots is perception. It has to run constantly to understand the surroundings of the robot (e.g., obstacles), dealing with possibly noisy data or incomplete information. The problem is that, when a new robot has to be trained, most of the data produced with other robots is not usable (different sensors, different means for grabbing and dealing with objects, etc.). Looking at how babies interact with their environment (touching everything and tasting it) might bring new insights into how to address these problems.
– The “AI in practice” session that occurred on Sunday was great. The room was packed, and we saw presentations from companies like IBM, LinkedIn or Google.
I liked these talks because they highlighted some of the current challenges faced by AI. For example, Michael Witbrock (IBM) described how, despite the advances in Machine Learning applications, the representations used to address a problem can barely be reused. The lack of explanation of deep learning techniques does not help either, especially in diagnosing diseases: doctors want to know why a certain conclusion is reached. IBM is working towards improving the inference stack, so as to be able to combine symbolic systems with non-symbolic ones.
Another example was Gary Marcus (Uber labs), who explained that although there has been a lot of progress in AI, AGI (artificial general intelligence) has not advanced that much. Perception is more than being able to generalize from a situation, and machines are currently not very good at it. For example, an algorithm may be able to detect that there is a dog in a picture, and that the dog is lifting weights, but it won’t be able to tell you why this picture is unique or rare. The problem with current approaches is that they are incremental. Sometimes, there is a fear of stepping back and looking at how some of our current problems are addressed. Focusing too much on incremental science (i.e., improving the precision of the current algorithms by a small percentage) may leave us stuck in local maxima. Sometimes we need to address problems from different angles to make sure we make progress.
– AI in games is a thing! Over the years I have seen some approaches that aim to develop smart players, but attending this tutorial was one of the best experiences in the conference. Julian Togelius gave an excellent overview/tutorial of the state of the art in the field, including how a simple A* algorithm may almost be a perfect player for Mario (if we omit those levels where we need to go back), how games are starting to adapt to players, how to build credible non-player characters and how to automatically create scenarios that are fun to play. Then he introduced other problems that overlap with many of the challenges addressed in the keynotes: 1) How can we produce a general AI that learns how to play any game? And 2) how can we create a game automatically? For the first one, I found it interesting that they have already developed a benchmark of simple games that will test your approach. The second one, however, is deeper, as the problem is not creating a game, or even a valid game. The real problem in my opinion is creating a game that a player considers fun. At the moment the current advances consist of modifications of existing games. I’ll be looking forward to reading more about this field and its future achievements.
– AI in education: Teaching ethics to researchers is becoming more and more necessary, given the pace at which science evolves. At the moment, this is an area often overlooked in any PhD or research program.
– The current NSF research plan is not moot! Lynne Parker introduced the AI research and development strategic plan, which is expected to remain untouched even after the results of the latest election. The current focus is on how AI could help with the national priorities: liberty (e.g., security), life (education, medicine, law enforcement, personal services, etc.) and pursuit of happiness (manufacturing, logistics, agriculture, marketing, etc.). Knowledge discovery and transparent and explainable methods will help for this purpose.
– Games night! Great opportunity to socialize and meet part of the community by drawing, playing puzzles and board games.
– Many institutions are hiring. The job fair had plenty of participating companies and institutions, but it was a little bit far away from the main events and I didn’t see many people attending. In any case, there were also plenty of companies with stands while the main conference was happening, which made it easy to talk to them and see what they were working on.
– Avoid reinventing the wheel! There was a cool panel on the history of expert systems. Sometimes it is good to just take a step back and see how research problems were analyzed in the past. Some of those solutions still apply today.
– Check out the program for more details on the talks and presentations.
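Going back to the A* comment from the games tutorial above: for readers who have never seen the algorithm, here is a minimal sketch on a toy grid. To be clear, this is not the Mario-playing agent discussed in the tutorial (that one has to search over game states, with jumps, enemies and timing); it is plain textbook A* with a Manhattan-distance heuristic, and the `a_star` function and the `level` layout are invented for this example.

```python
import heapq


def a_star(grid, start, goal):
    """Return the shortest path from start to goal as a list of (row, col)
    cells, or None if the goal is unreachable. '#' cells are walls."""
    rows, cols = len(grid), len(grid[0])

    def heuristic(cell):
        # Manhattan distance: admissible for 4-way moves with unit cost.
        return abs(cell[0] - goal[0]) + abs(cell[1] - goal[1])

    frontier = [(heuristic(start), 0, start)]  # entries are (g + h, g, cell)
    came_from = {start: None}
    best_cost = {start: 0}

    while frontier:
        _, cost, current = heapq.heappop(frontier)
        if current == goal:
            # Walk the parent links back to the start.
            path = []
            while current is not None:
                path.append(current)
                current = came_from[current]
            return path[::-1]
        if cost > best_cost[current]:
            continue  # stale entry: a cheaper route to this cell was found
        r, c = current
        for nxt in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            nr, nc = nxt
            if not (0 <= nr < rows and 0 <= nc < cols) or grid[nr][nc] == "#":
                continue
            new_cost = cost + 1
            if new_cost < best_cost.get(nxt, float("inf")):
                best_cost[nxt] = new_cost
                came_from[nxt] = current
                heapq.heappush(
                    frontier, (new_cost + heuristic(nxt), new_cost, nxt)
                )
    return None


# Toy "level": S is the start, G the goal, # are walls.
level = [
    "S..#....",
    ".#.#.##.",
    ".#...#..",
    ".####.#.",
    "......#G",
]
path = a_star(level, (0, 0), (4, 7))
print(len(path), "cells:", path)
```

The heuristic is what makes the search head towards the goal instead of flooding the whole map, which also hints at why levels that force you to go backwards are harder for this kind of player.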
Attending AAAI has been a great learning experience. I really recommend it to anyone working in any field of AI, especially if you are a student or you are looking for a job. I also find it very exciting that some of the problems I am working on are also identified as important by the rest of the community. In particular, the need to create proper abstractions to facilitate understanding and shareability of current methods was part of the main topic of my thesis, while the need to explain the result of applying a certain technique is highly related to what we do for capturing the provenance of scientific workflow results. As described by some of the speakers, “Debugging is a kind of alchemy” at the moment. Let’s turn it into a science.
A few months ago my supervisor told me about the opportunity to join a group of geologists on a field trip to Yosemite. The initiative was driven by the EarthCube community, in an effort to bring together experts from different geological domains (tectonics, geochemistry, etc.) and computer scientists. I immediately applied for a place on the trip, and I have just returned to Spain. It has been an amazing experience, so I want to summarize in this post my views and experiences during the whole week.
Travelling and people
For someone travelling from Europe, the trip was exhausting (2 layovers and up to 24 hours of flights + waiting), but I really think it was worth it. I have learnt a lot from the group and the challenges geologists are facing when collecting and sharing their data, samples and methods. All participants were open and had the patience to explain any doubts or concerns on the geological terms being used in the exercises and talks. Also, all the attendees were highly motivated and enthusiastic to learn new technologies and methods that could help them solve some of their current issues. I think this was crucial for creating the positive environment of discussion and collaboration we had during the whole experience. I hope this trip helps push forward best practices and recommendations for the community.
Yosemite National Park
There is little I can say about the park and its surroundings that hasn’t already been said. Therefore, I’ll let the pictures speak for themselves:
What was the rationale behind the trip?
As I said before, the purpose of the field trip was to bring together computer scientists and geologists. There are two main reasons why this could be interesting for geologists: first, the geologists could show and tell computer scientists how they work and their current struggles with either hardware or software in the field. Second, geologists could connect with other geologists (or computer scientists) in order to foster future collaborations.
From a computer science point of view, I believe this kind of trip is beneficial to raise awareness of current technologies among end users (in many cases we have the technology but we don’t have the users to use it). Also, it always helps to see with one’s own eyes what the real issues faced by scientists in a particular domain are. It makes them easier to understand.
What was I doing there?
Nobody would believe me when I told them that I was going to travel to Yosemite with geologists to do some “field” work. And, to be honest, one of my main concerns when preparing the trip was that I had no idea how I would make myself useful to the rest of the attendees. I felt like I would learn a lot from all the other people, since some of their problems are likely to be similar to problems in other areas, and I wanted to give something in return. Therefore I talked to everyone and asked a lot of questions. I also gave a 10 minute introductory talk on the Semantic Web (available here), to help them understand the main concepts they had already heard in other talks or project proposals. Finally, I came up with a list of challenges they have from the computational perspective and proposed extending existing standards to address some of them.
Challenges for geologists
I think it is worth describing here some of the main challenges that these scientists are facing when collecting, accessing, sharing and reusing data:
Sample archival and description: there is no standard way of processing and archiving the metadata related to samples. Sometimes it is very difficult to find the metadata associated with a sample, and a sample with no metadata is worthless. Similarly, it is not trivial to find the samples that were used in a paper. NSF is now demanding a Data Management Plan, but what about the Sample Management Plan? Currently, every scientist is responsible for his/her samples, and some of those might be very expensive to collect (e.g., a sample from an expedition to Mount Everest). If someone retires or changes institutions, the samples are usually lost. Someone told me that the samples used in his work could be found in his parents’ garden, as he didn’t have space for them anymore (at least those could be found 🙂 ).
Repository heterogeneity and redundancy: Some repositories have started collecting sample data (e.g., SESAR), which shows an effort from the community to address the previous issue. Every sample is given a unique identifier, but it is very difficult to determine if a sample already exists in the database (or in other repositories). Similarly, there are currently no applications that allow exploiting the data of the repository. Domain experts perform SQL queries, which will be different for each repository as well. This makes integrating data from different repositories difficult at the moment.
Licensing: People are not sure about which license they have to attach to their data. This is key for being attributed correctly when someone reuses your results. I have seen this issue in other areas as well. I think this link explains everything in great detail: http://creativecommons.org/choose/.
Sharing and reusing data: Currently if someone wants to reuse another researcher’s mapping data (i.e., those geological observations they have written down on a map), they would have to contact the authors and ask them for a copy of their original field book. With luck, there will be a scanned copy or a digitized map, which then will have to be compared (manually) to the observations performed by the researcher. There are no approaches for performing such a comparison automatically.
Trust: Data from other researchers is often trusted, as there are no means to check whether the observations performed by a scientist are true or not unless one goes into the field.
Sharing methods: I was surprised to hear that the main reason why the methods and workflows followed in an experiment are not shared is that there is no culture of doing it. Apparently the workflows are there because some people use them as a set of instructions for students, but they are not documented in the scientific publications. This is an issue for the reproducibility of the results. Note that here we define workflow as the set of computational steps that are necessary to produce a research output in a paper. Geologists also have manual workflows for collecting observations in the field. These are described in their notebooks.
Reliability: This was brought up by many scientists in the field. Many still think that the applications on their phones are often not reliable. In fact we did some experiments with an iPhone and an iPad and you could see differences in their measurements due to their sensors. Furthermore, I was told that if a rock is magnetic, these devices become useless. Most of the scientists still rely on their compasses to perform their measurements.
Why should geologists share their data?
The vans haven’t been just a vehicle to take us to some beautiful places on this trip; they have been a useful means to get people to discuss some of the challenges and issues described above. In particular, I would like to recall the conversation we had on one of the last days between Snir, Zach, Basil, Andreas, Cliff and others. After discussing some of the benefits that sharing has for other researchers, Andreas asked about the direct benefit he would obtain from sharing his data. This is crucial in my view: if sharing data is only going to benefit other people and not me, why should I do it? (unless I get funding for it). Below you can find the arguments in favor of adopting this practice as a community, tied with some of the potential benefits. (Quoting Cliff Joslyn in points 1 and 2)
Meta-analysis: or being able to reuse other researchers’ data to analyze and compare new features. This is also beneficial for one’s own research, in case you change your laptop/institution and no longer have access to your previous data.
Using consumer communities to help curate data: apparently, some geophysicists would love to reuse the data produced by geologists. They could be considered as clients and taken into account when applying for a grant in a collaboration.
Credit and attribution: Recently some journals like PLOS or Elsevier have started creating data journals. There you would just upload your dataset as a publication, so people using it can cite it. Additionally, there are data repositories like FigShare, where just by uploading a file you make it citable. This way someone could cite an intermediate result you obtained during part of your experiments!
Reproducibility: sharing data and methods is a clear sign of transparency. By accessing the data and methods used in a paper, a reviewer would be able to check the intermediate and final results of a paper in order to see if the conclusions hold.
Are these benefits enough to convince geologists to share and annotate their data? In my opinion, the amount of time that one has to spend documenting work is still a barrier for many scientists. The benefits cannot be seen instantly, and in most of the cases people don’t bother after writing the paper. It is an effort that a whole community has to undertake, and make it part of its culture. Obviously, automatic metadata recording will always help.
This trip has proved very useful for bringing together people from different communities. Now, how do we move forward? (again, I do some quoting from Cliff Joslyn, who summarized some of the points discussed during the week):
Identify motivated people who are willing to contribute with their data.
Creation of a community database.
Agree on standards to use as a community, using common vocabularies to relate the main concepts on each domain.
Analyze whether there are valuable existing efforts to build on instead of starting from scratch.
Contact computer scientists, ontologists and user interface experts to create a model that is both understandable and easy to consume.
Exploit the community database. Simple visualization in maps is often useful to compare and get an idea of mapped areas.
Collaborate with computer scientists instead of considering them as mere servants. Computer scientists are interested in challenging real world problems, but they have to be in the loop.
Finally, I would like to thank Matty Mookerjee, Basil Tikoff and all the rest of the people who made this trip possible. I hope it happens again next year. And special thanks to Lisa, our cook. All the food was amazing!
Below I attach a summary of the main activities of the trip by days, in case someone is interested in attending future excursions. Apologies in advance for any incorrect usage of geological terms.
Summary of the trip
Day1: after a short introduction on how to configure your notebook (your convention, narrative, location, legend, etc.) we learnt how to identify the rock we had in front of us by using the hand lens. Rocks can be igneous, metamorphic or sedimentary, and in this case, as can be seen in the pictures below, we were in front of the igneous type. In particular, granite.
Once you know the type of rock you are dealing with and its location, it’s time to sketch, leaving out the details and representing just those that are relevant for your observation. Note that different types of geologists might consider different features relevant. Another interesting detail is that observations are always associated with areas, not points, because of a possible error. This might sound trivial but makes a huge difference (and adds more complexity) when representing the information as a computer scientist.
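To illustrate why the area-versus-point distinction matters on the computer science side, here is a minimal sketch. It is my own simplification, not something the geologists or the apps we used actually implement: I assume the uncertainty can be modelled as a circle around a local (x, y) position in metres, and the `Observation` class and the sample values are made up.

```python
from dataclasses import dataclass
from math import hypot


@dataclass(frozen=True)
class Observation:
    """A field observation located somewhere inside a circle, not at a point."""
    label: str
    x: float       # easting in metres (hypothetical local coordinates)
    y: float       # northing in metres
    radius: float  # positional uncertainty in metres

    def may_coincide_with(self, other: "Observation") -> bool:
        """True if the two uncertainty circles overlap or touch, i.e. both
        observations could refer to the same spot on the ground."""
        distance = hypot(self.x - other.x, self.y - other.y)
        return distance <= self.radius + other.radius


# Invented sample values, for illustration only.
obs_a = Observation("observation A", x=100.0, y=200.0, radius=5.0)
obs_b = Observation("observation B", x=106.0, y=208.0, radius=6.0)
obs_c = Observation("observation C", x=150.0, y=200.0, radius=5.0)

print(obs_a.may_coincide_with(obs_b))
print(obs_a.may_coincide_with(obs_c))
```

With points, comparing two observations is a simple equality check on coordinates. With areas, “same place” becomes an overlap test, and real data would need polygons, coordinate reference systems and spatial indexes on top of this, which is where the extra complexity comes from.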
The day ended with three short talks: one about the Strabo app for easily handling and mapping your data with your phone, and the Fieldmove app (Andrew Bladon) for easily measuring strike and dip, adding annotations and representing them in a map. Shawn Ross wrapped up the session by talking briefly about his collaborations with archaeologists for field data collection.
Day2: We learnt about cross sections in Sierra Nevada, after a short explanation on the evolution of the area from a geological perspective. Apparently geologist think in time when analyzing a landscape, in order to determine which were the main changes that were necessary to produce the current result. In this regard, it is like learning about the provenance of the earth, which I think it is pretty cool.
Unfortunately, Matty’s favorite section was not accessible and had to be explained via a poster. Some flooding had destroyed the road and was too far away to be reached by foot. Therefore we were driven to another place in the Sierra where we were asked to draw a cross section ourselves (with the help of a geologist). It was an area with very clear faults, and most of us drew their direction right. The excursion ended when one of the geologist gave a detailed speech on the rationale behind her sketch, so we could compare.
When we arrived at the research center, Jim Bowing gave a short talk on state, and how geologists should be aware of their observations and the value of the attributes described in them. We as computer scientists can only recreate what we are given. We then split into groups and thought about use cases, reporting two to the rest of the groups.
Day 3: It was time to learn about the gear: GPS, tablet and laptop (which can be heavy), all equipped with long-lasting batteries (they could last more than 2 days of fieldwork). We went to the Deep Springs Valley, and after locating ourselves on a topographic map we followed a contact (i.e., the line between two geological units). We experienced some frustration with the devices (the screen was really hard to see), and we poured some acid on the rocks in order to determine whether they were carbonates or not.
The contact finished abruptly in a fault after a few hundred meters (represented as a “v” on a map). We determined its orientation and fault access, which was possible thanks to some of the mobile applications we were using in the field. If done by hand, we would have had to analyze our measurements at home.
After a brief stop at an observatory full of metamorphic rocks, we headed back to the research center. There, Cliff Joslyn and I gave a brief introduction to databases, relational models and the Semantic Web before doing another group activity. In this case, we tried to think about the perfect app for geologists, and what kind of metadata it would need to capture.
Day 4: We went to the Caldera, close to a huge crack in the ground, where we learnt a bit more about its formation. There was a volcanic eruption in two phases, which can be distinguished by the materials that are around the pumice stones.
We then went to the lakes, where Matty taught us how to extract a sample. First you ought to identify the rock properly, annotate it with the appropriate measurements (orientation, strike, dip), label the rock and then extract it. If you use a sample repository like SESAR, you may also ask in advance for identifiers and print stickers for labeling the rock.
We ended the hike with a short presentation by Amanda Vizedom on ontologies and a discussion about the future steps for the community.
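The sampling steps can be read as a small checklist, which I sketch below in Python. This is purely illustrative: the `RockSample` class, its field names and the example label are made up by me, and they do not correspond to the data model of SESAR or of any of the apps we used.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class RockSample:
    """Hypothetical sample record following the steps described above."""
    rock_type: Optional[str] = None     # step 1: identify the rock
    strike_deg: Optional[float] = None  # step 2: measurements
    dip_deg: Optional[float] = None
    label: Optional[str] = None         # step 3: label (e.g. a pre-requested identifier)

    def missing_steps(self) -> List[str]:
        """List the steps still pending before the sample can be extracted."""
        missing = []
        if not self.rock_type:
            missing.append("identify the rock")
        if self.strike_deg is None or self.dip_deg is None:
            missing.append("record strike and dip")
        elif not (0 <= self.strike_deg < 360 and 0 <= self.dip_deg <= 90):
            missing.append("fix out-of-range strike or dip")
        if not self.label:
            missing.append("label the rock")
        return missing

    def ready_to_extract(self) -> bool:
        """Step 4 (extraction) is only allowed once nothing is missing."""
        return not self.missing_steps()


sample = RockSample(rock_type="granite")
print(sample.missing_steps())     # strike/dip and label are still pending

sample.strike_deg = 120.0
sample.dip_deg = 35.0
sample.label = "EXAMPLE-0001"     # made-up label, not a real identifier
print(sample.ready_to_extract())
```

The point of the sketch is the ordering: the metadata is captured before the rock leaves the outcrop, because afterwards the orientation can no longer be measured.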
Lately I’ve been asked to do several reviews for different workshops, conferences and journals. In this post I would like to share with you a generic template to follow when reviewing a scientific publication. If you have been doing it for a while you may find it trivial, but I think it might be useful for people who have recently started reviewing. At least, when I started, I had to ask my advisor and colleagues for a similar one.
But first, several reasons why you should review papers:
It helps you identify whether a scientific work is good or not, and refine your criteria by comparing yourself with other reviewers. It also trains you to defend your opinion based on what you read.
It helps you refine your own work, by identifying common flaws that you normally don’t detect when writing your own papers.
It’s an opportunity to update your state of the art, or to learn a little about other areas.
It lets you contribute to the scientific community and gain public visibility.
A scientific paper might be the result of months of work. Even if you think it is trivial, you should be methodical in explaining the reasons why you think it should be accepted or rejected (yes, even if you think the paper should be accepted). A review should not be just an “Accepted” or “Rejected” statement, but should also contain valuable feedback for the authors. Below you can see the main guidelines for a good review:
Start your review with an executive summary of the paper: this will let the authors know the main message you have understood from their work. Don’t copy and paste the abstract; try to communicate the summary in your own words. Otherwise they’ll just think you didn’t pay much attention when reading the paper.
Include a paragraph summarizing the following points:
Grammar: Is the paper well written?
Structure: Is the paper easy to follow? Do you think the order should have been different?
Relevance: Is the paper relevant for the target conference/journal/workshop?
Novelty: Is the paper dealing with a novel topic?
Your decision. Do you think the work should be accepted for the target publication? (If you don’t, expand your concerns in the following paragraphs)
Major concerns: Here is where you should say why you disagree with the authors, and highlight your main issues. In general, a good research paper should successfully describe four main points:
What is the problem the authors are tackling? (Research hypothesis) This point is tricky, because sometimes it is really hard to find! And in some cases the authors omit it and you have to infer it. If you don’t see it, mention it in your review.
Why is this a problem? (Motivation). The authors could have invented a problem which had no motivation. A good research paper is often motivated by a real-world problem, potentially with a user community that benefits from the outcome.
What is the solution? (Approach). The description of the solution adopted by the authors. This is generally easy to spot in any paper.
Why is it a good solution? (Evaluation). The validation of the research hypothesis described in point one. The evaluation is normally the key part of the paper, and the reason why many research publications are rejected. As my supervisor has told me many times, one does not evaluate an algorithm or an approach; one has to evaluate whether the proposed algorithm or approach validates the research hypothesis.
When a paper describes the previous four points well, it is (generally) accepted. Of course, not all papers fall into the category of research papers (e.g., survey papers or analysis papers). But the four previous points should cover a wide range of publications.
Minor concerns: You can point out minor issues after the big ones have been dealt with. This is not mandatory, but it will help the authors polish their work.
Typos: unless there are too many, you should point out the main typos you find in your review, as well as the sentences you think are confusing.
Don’t be a jerk: many reviews are anonymous, and people tend to be crueler when they know their names won’t be shown to the authors. Instead of saying that something “is garbage”, state clearly why you disagree with the authors’ proposal and conclusions. Let the facts speak for themselves, not your bias or opinion.
Consider the target publication. You can’t use the same criteria for a workshop, conference or journal. Normally people tend to be more permissive at workshops, where the evaluation is not that important if the idea is good, but require a good paper for conferences and journals.
Highlight the positive parts of the authors’ work, if any. Normally there is a reason why the authors have spent time on the presented research, even if the idea is not very well implemented.
Check the links, prototypes, evaluation files and, in general, all the supplementary material provided by the authors. A scientist should not only review the paper, but also the research described in it.
Be constructive. If you disagree with the authors on a point, always mention how they could improve their work. Otherwise they won’t know how to handle your issue and may ignore your review.
Last week I attended Provenance Week in Cologne. For the first time, IPAW and TAPP were held together, even having some overlapping sessions like the poster lightning talks. The clear benefit of having both events at the same time is that a bigger part of the community was actually able to attend, even if some argued that 5 full days of provenance is too long. I got to see many familiar faces, and finally meet some people I had only talked to remotely.
In general, the event was very interesting, definitely worth paying a visit. I was able to gather an overview of the state of the art in provenance in many different domains, and how to protect it, collect it and exploit it for various purposes. Different sessions led to different discussions, but I liked 2 topics in particular:
The “Sexy” application for provenance (Paul Groth). After years of discussions we have a standard for provenance, and many applications are starting to use it and extend it to represent provenance across different domains. But there is no application that uses provenance from different sources to do something meaningful for the final user. Some applications define domain-dependent metrics to assess trust, others like PROV-O viz visualize it to see what is going on in the traces, and others try to use it to explain what kind of things we can find in a particular dataset. But we still don’t have the provenance killer app… will the community be able to find it before the next Provenance Week?
Provenance has been discussed for many years now. How come we are still so irrelevant? (Beth Plale). This was brought up by the keynote speaker and organizer Beth Plale, who talked about different consortia in the U.S. that are starting to care about provenance (e.g., the HathiTrust publisher or the Research Data Alliance). As some people pointed out, it is true that provenance has gained a lot of importance in recent years, to the point that some grants will only be awarded if the researchers guarantee the tracking of provenance. The standard helps, but we are still far from solving the provenance-related issues. Authors and researchers have to see the benefit of publishing provenance (e.g., attribution, with something like PROV-Pingback); otherwise it will be very difficult to convince them to do so.
Apart from the pointers I have included above, many other applications and systems were presented during the week. These are my highlights:
Reconstruction of provenance: Hazeline Asuncion and Tom de Nies both presented their approaches for finding the dependencies among data files when the provenance is lost. I find this very interesting because it could be used (potentially) to label workflow activities automatically (e.g., with our motif list).
Provenance capture: RData tracker, an intrusive, yet simple way of capturing provenance of scripts in R. Other approaches like noWorkflow also looked OK, but seemed a little heavier.
Provenance benchmarking: Hugo Firth presented ProvGen, an interesting approach for creating huge synthetic provenance graphs simulating real-world properties (e.g., Twitter data). All the new provenance datasets were added to the ProvBench GitHub page, now also in Datahub.
Provenance pingbacks: Tim Lebo and Tom de Nies presented two different implementations (see here and here) for the PROV Pingback mechanism defined in the W3C. Even though security might still be an issue, this is a simple mechanism to provide attribution to the authors. Fantastic first steps!
Provenance abstraction: Paolo Missier presented a way of simplifying provenance graphs while preserving the prov notation, which helps to understand better what is going on in the provenance trace. Roly Perrera presented an interesting survey on how abstraction is also being used to present different levels of privacy when accessing the data, which will be more and more important as provenance gains a bigger role.
Applications of provenance: One of my favorites was Trusted Tiny Things, which aimed at describing everyday things with provenance descriptions. This would be very useful to know, in a city, how much the government spent on a certain item (like a statue), and who was responsible for buying it. Other interesting applications were Pinar Alper’s approach for labeling workflows, Jun Zhao’s approach for generating queries for exploring provenance datasets and Matthew Gamble’s metric for quantifying the influence of one article on another just by using provenance.
The Second Beyond the PDF workshop finally took place last week in Amsterdam (fortunately I got travel support from the organizers, so I was able to attend the full event). If I had to pick a word to describe the workshop, it would be “different”. As Paul Groth (one of the chairs) summarizes in his post, the audience was heterogeneous: there were people from the biomedical, humanities, social sciences and physical sciences domains, belonging to different types of organizations (ranging from academic to governmental). Publishers and editors were also present, and many different tools, visions and ideas were presented to improve the future of scholarly communication. This whole context was a bit different from what one is used to seeing in other conferences, where you find people doing similar things to what you do, and you discuss your research rather than the idea of how to communicate it to others. Here people were not afraid to tell publishers and editors why they thought the system was broken, presenting their arguments in an informal, friendly environment.
Another interesting fact was the “second screen” showing the Twitter wall live. People were very active, highlighting the interesting quotes from the talks and initiating debates in parallel to all the sessions. Even today the tag #btpdf2 is still active. Congrats to all the organizing staff!
Detailed summary and highlights
The program of the workshop is available here. Below you can see the summary and highlights from the different sessions and interesting quotes I wrote down in my notes.
The day started with a Keynote by Kathleen Fitzpatrick, who explained how the book is not dead, although the academic book is kind of dying. The blog could be a replacement, since it is a kind of alternative way to publish the resources. You are able to get comments from the community, feedback suggestions and support. Why couldn’t we be our own publishers?
The current reviewing process raises concerns; could it be part of what is broken? Bias and flaws are not unusual, and reviewing requires a great deal of labor for which we normally don’t receive much credit. As an example, she explained how the book she had been writing had more impact in blog form than in its final published format.
Finally, she remarked how important the online communities are. If you build a tool or a service without a community, people will not just come. You have to build a community first. Some interesting quotes: “Publishers will have to focus more on services and less on selling digital objects”. “We need filters, not gatekeepers” (referring to publishers and editors). “The network is not a threat. It helps to reach more people”.
Laura Czerniewicz and Michelle Willmers followed the keynote with a session on context. They highlighted the dangers of complete open access: will it become a flood of content? There is a need for a reward system. What do authors get from open access? Editors are gatekeepers. Another important factor is that in the end only journal articles are considered when judging the validity of a researcher. Tweets, blogs, talks, workshops and conferences are ignored, even when they could have had more impact than the actual journals. In most cases journal articles are the tip of the iceberg.
Next, in the Vision session, Nathan Jenkins introduced Authorea, a very cool tool built on Ruby on Rails for writing articles online without having to deal with LaTeX compilation. Mercé Crossas presented Dataverse, a portal for archiving data results for citation purposes, motivated by the volatility of the links in old papers. Amalia S. Levi explained how in historical research a lot of the data already existed, but the links were missing. (This reminds me of some conversations that I’ve had recently about how papers are cited in the scientific community. It turns out that sometimes this is the case nowadays as well). Joost Kircz hit the spot in his speech (in my opinion): Are we going Beyond the PDF or Beyond the essay? An enhanced PDF is still stuck in the page paradigm. Papers represent structured or randomized knowledge that should be browsed, and that is often not possible in a book. I liked his closing statement: “Publishing is not a science, but is a craft”. Lisa Girard followed with StemBook, a portal where authors can keep their findings up to date, allowing the community to review their work in stem cell biology. An interesting thing about it is that people can upload their protocols and annotate them using Domeo, aligned with the Annotation Ontology. Paolo Ciccarese followed with an overview of that ontology, summarizing their efforts and collaboration in the community in order to come up with a widely adopted standard.
As a small comment to this session, I think it is a bit curious that so many finished (or nearly finished) tools were presented in a “Vision” session. It would have been interesting to see how some of the presenters picture the future of publication and how to get there (either by using some of the presented tools or not).
After lunch there was a session on new models for content dissemination, where Theodora Bloom started by stating very clearly what the main current problems are for dissemination:
Access to what you want to read and use
Publication venue as a measure of quality.
Having to repeat the cycle of publication in different journals
Poor links for underlying data.
She also explained how in PLOS ONE research leading to negative results is also published, but hardly anyone submits it. I really liked this; it reminded me of a quote from Thomas Edison: “I have not failed. I’ve just found 10,000 ways that won’t work”. If an idea looks promising but doesn’t work as expected, it’s important to share it with the community so that nobody else repeats the same mistake. Who knows, it might even inspire other people to come up with a better solution.
Brian Hole followed, talking about metajournals and the social contract of science, and combining them in the idea of an Ultrajournal.
The second part of the session was introduced by a lively Jason Priem, who talked about how the printing press had been the first revolution for disseminating content and the Internet the second one. According to him, we should mine the network in order to produce the appropriate filters for the information. Keith Collier followed by introducing Rubriq, an independent peer-review system that aims to decouple peer review from publication. Next, Kaveh Bazargan raised the current concern about typesetters, and how we should get rid of them. Instead, XML or blog posts should play the role of typesetters, giving more freedom to the writer. Finally, Peter Bradley talked about Hypothes.is, an open source platform for the evaluation of information, and Alf Eaton introduced PeerJ, an open access peer-reviewed journal with metadata for all its papers.
The final session of the day was about the business case, where three representatives explained different business models and three stakeholders plus the audience asked questions about them. Wim van der Stelt argued that Springer is not resisting change, and Mark Hahnel argued that authors should be able to receive credit for their data, as happens in FigShare. The discussion brought some interesting topics to the table: that scholarly communication per se is not profitable and we need government funding, how to move from the journal impact factor to a measure that is meaningful (and convince the government to support it), and how to share our work with those who cannot afford to pay for it. Another important observation is the number of hours spent by researchers on rejected papers per year, which adds up to 11-16 million!
The day ended with the session on demos and posters. Marco Roos and Aleix Garrido were by my side talking about the wf4ever project, while I spoke a bit about the work done reproducing the TB-Drugome workflow. The slides can be seen here.
Carol Teinoir started the day by trying to analyze and understand the needs of scholars. She gave a lot of metrics about the main reasons why scholars do not share their data (“I have not the time”, or “I’m not required to” were among the top five), and how successful researchers turn out to read more. She also gave metrics on who is sharing data versus who is willing to share their data, and analyzed how e-books had influenced printed copies. An interesting fact: in Australia, e-books have almost replaced printed copies.
The “Making it happen” session was next. Asunción Gómez Pérez talked about the SEALS evaluation platform, which allows reproducing the different tests of an experiment automatically. Graeme Hirst spoke about usability, the “neglected dimension”, and how we are “forced” to use systems with low usability like Word and LaTeX. The gain should be greater than the pain when writing a paper. Rebecca Lawrence followed, talking about data review and how to share data: the requirement of a data sharing plan, how things should be done according to standards, where to find the funding for the previous two, how we should refuse papers whose data is not accessible, and how a reviewer should have access to all the materials in order to properly review the paper.
The session finished with several short presentations that can be accessed here. Anita de Waard insisted on the need for a new reward system, although no further details were given. I also liked the talk by Melissa Haendel on reproducibility in science, even if she didn’t talk about the role of scientific workflows in reproducibility. Another interesting tool was ORCID, a registry for scholars with author disambiguation. Gully Burns ended the session by analyzing how different parameters change an experiment.
We broke out into different sessions during lunch. I went to the reproducibility one, where we shared the different issues that currently exist when trying to store and rerun experiments. However, unlike the data citation group, we didn’t come up with a manifesto.
The next session dealt with new models for the evaluation of research, where the organizer, Carole Goble, proposed a little role play. Each of the 6 participants wore a different hat representing the role of their institution. Phil Bourne was the institutional dean (officer hat); Victoria Stodden (with the typical English bureaucratic hat, on the right of the picture) represented the public funding agencies; Christine Borgman represented the digital libraries (second-hand cowboy hat); Jan Reichelt, with the “cool” hat on the left, represented the commercial funders; Scott Edmunds represented the publisher role with a top hat (unfortunately he wasn’t wearing it in the picture); and Steve Pettifer represented the academic role (he can’t be seen properly in the picture).
The summary of the discussion was as follows, for each role:
Funding agencies: they are not interested in the evaluation of the academic research. It should be driven by the community.
The dean: I’ll quote the acting by Phil:
“Oh, we have produced a 200 page report about the possible changes that we could do to the system.
– And what are you going to change?
– Very little!”
It is events like this one that provide the new ideas.
Publishers and academia: death to impact factor.
Commercial funders: code and methods matter. They should be treated as first-class citizens (I couldn’t agree more).
Digital libraries: The standards are problematic. Tools don’t connect, and interoperability is an issue.
The final session, Visions for the future, grouped a set of flash talks from very different people. The most successful ones were given by Carole Goble (winner), who looked at the publication of data from a software engineering perspective, and how we could do several releases of the data as happens with software: “Don’t publish, release!”; Stian Haklev, with his proposal to create an alternative to Google Scholar (I liked his answer to Ed Hovy, when he asked what was new in his proposal: “There is nothing new about this, and that is precisely what is new, that we are just able to make it”); Jeffrey Lancaster, with his proposal to change the CSL citation styles; and Kaveh Bazargan, who demanded that publishers release the XML of the papers instead of the PDF. The job of a publisher should be to disseminate content, and not to dictate how we read the papers. He even gave an online demo of a tool that could render the paper from the XML in several different ways depending on the user’s preferences.
I also found interesting the proposal by Alejandra Gonzalez-Beltran, who talked about isa-tools, a platform used by pharmaceutical companies for the collection, curation and reuse of datasets; and of course the idea of Olga Giraldo, who wants to provide the means to transform laboratory protocols into nanopublications and provide checklists to organize them properly. Below you can see a picture of the participants in the session:
And that’s all! I think that in summary it was a nice event with a lot of discussion and claims from academia to editors, publishers and funding agencies. Of course, I guess that part of the motivation of the workshop is for them to take ideas on how the system could be changed plus a state of the art of different tools and platforms that they could incorporate to their systems.
Results, next steps?
There was a lot of debate but no session on what the next steps should be. I think this would have been an interesting thing to have, although it is difficult to fit it all in a 2-day event. As a result, some of the people participating in the breakout sessions wrote the “Data citation manifesto”, which I would really like people to follow in order to give credit for their data (link here, please share!).
Also the idea of an open Google Scholar (an open alternative, as OpenStreetMap is to Google Maps) looks promising. I hope it gets implemented!
And finally, some personal thoughts. After attending the event I realized that, as a computer scientist working to enable reproducibility and reusability of other people’s work, sometimes in my own area we don’t follow the reproducibility principles: papers about tools that are not available after a while, published algorithms without an implementation, unstable links, etc. I have always tried to include a reference to the code and evaluations done in my work for the reviewers to access it, but I might start using some of the tools shown in the workshop for the sake of preservation.