Moving to Stack Overflow: Best-Answer Prediction in Legacy Developer Forums
Calefato, Fabio; Lanubile, Filippo; Novielli, Nicole
2016-01-01
Abstract
Context: Recently, more and more developer communities are abandoning their legacy support forums and moving to Stack Overflow. The motivations are diverse, yet they typically include achieving faster response times and greater visibility through access to a modern and very successful infrastructure. One downside of migration, however, is that the history and the crowdsourced knowledge hosted at previous sites remain separated, or even get lost if a community decides to completely abandon the legacy developer forum. Goal: Adding to the body of evidence of existing research on best-answer prediction, here we show that, from a technical perspective, the content of existing developer forums might be automatically migrated to Stack Overflow, although most forums do not allow users to mark a question as resolved, a distinctive feature of modern Q&A sites. Method: We trained a binary classifier with data from Stack Overflow and then tested it with data scraped from Docusign, a developer forum that has recently completed the move. Results: Our findings show that best answers can be predicted with good accuracy, relying only on shallow linguistic (text) features, such as answer length and the number of sentences, combined with other features like answer upvotes and age, which can be easily computed in near real-time. Conclusions: The results provide initial yet positive evidence towards the automatic migration of crowdsourced knowledge from legacy forums to modern Q&A sites.
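To illustrate the kind of shallow features the abstract mentions (answer length, number of sentences, upvotes, and age), the following is a minimal sketch of how they might be computed for a single answer. It is not the authors' implementation: the function name `shallow_features`, the dictionary keys, the regex-based sentence splitter, and the definition of age as hours elapsed between the question and the answer are all assumptions made for illustration.

```python
import re
from datetime import datetime


def shallow_features(answer_text, upvotes, posted_at, question_posted_at):
    """Compute a few shallow features for one answer (hypothetical helper).

    Assumption: "age" is taken here as the time between the question and
    the answer, in hours; the paper may define it differently.
    """
    text = answer_text.strip()
    # Naive sentence split: break after '.', '!' or '?' followed by whitespace.
    sentences = [s for s in re.split(r"(?<=[.!?])\s+", text) if s]
    age_hours = (posted_at - question_posted_at).total_seconds() / 3600.0
    return {
        "length_chars": len(text),
        "word_count": len(text.split()),
        "sentence_count": len(sentences),
        "upvotes": upvotes,
        "age_hours": age_hours,
    }


# Example with made-up data: one answer posted two hours after its question.
features = shallow_features(
    "Use the REST API. It supports OAuth!",
    upvotes=3,
    posted_at=datetime(2016, 1, 1, 12, 0),
    question_posted_at=datetime(2016, 1, 1, 10, 0),
)
print(features)
```

Feature dictionaries of this shape could then be fed to any binary classifier that labels each answer as best or not best; which classifier the authors used is not stated in the abstract.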