Moving to Stack Overflow: Best-Answer Prediction in Legacy Developer Forums

Calefato, Fabio; Lanubile, Filippo; Novielli, Nicole
2016-01-01

Abstract

Context: Recently, more and more developer communities are abandoning their legacy support forums and moving to Stack Overflow. The motivations are diverse, yet they typically include faster response times and greater visibility through access to a modern and very successful infrastructure. One downside of migration, however, is that the history and the crowdsourced knowledge hosted at the previous site remain separated, or even get lost if a community decides to abandon the legacy developer forum completely.

Goal: Adding to the body of evidence from existing research on best-answer prediction, we show that, from a technical perspective, the content of existing developer forums might be automatically migrated to Stack Overflow, although most forums do not allow users to mark a question as resolved, a distinctive feature of modern Q&A sites.

Method: We trained a binary classifier with data from Stack Overflow and then tested it with data scraped from Docusign, a developer forum that has recently completed the move.

Results: Our findings show that best answers can be predicted with good accuracy by relying only on shallow linguistic (text) features, such as answer length and the number of sentences, combined with other features like answer upvotes and age, which can be easily computed in near real-time.

Conclusions: The results provide initial yet positive evidence towards the automatic migration of crowdsourced knowledge from legacy forums to modern Q&A sites.
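As an illustration of how cheap the features named in the abstract are to compute, the following is a minimal sketch and not the authors' implementation. The `Answer` record, its field names, the `shallow_features` function, the naive sentence splitter, and the definition of age as the time elapsed between question and answer are all assumptions made for this example.

```python
import re
from dataclasses import dataclass
from datetime import datetime


@dataclass
class Answer:
    # Hypothetical record layout; the field names are illustrative only.
    text: str
    upvotes: int
    posted_at: datetime


def shallow_features(answer: Answer, question_posted_at: datetime) -> dict:
    """Compute shallow features of the kind listed in the abstract."""
    # Naive sentence split: break after '.', '!' or '?' followed by whitespace.
    parts = re.split(r"(?<=[.!?])\s+", answer.text.strip())
    sentences = [p for p in parts if p]
    # Assumption: "age" is the delay between the question and this answer.
    age = answer.posted_at - question_posted_at
    return {
        "length_chars": len(answer.text),
        "num_sentences": len(sentences),
        "upvotes": answer.upvotes,
        "age_hours": age.total_seconds() / 3600.0,
    }


if __name__ == "__main__":
    question_time = datetime(2016, 1, 1, 12, 0)
    answer = Answer("Use the REST API. It works!", 3, datetime(2016, 1, 1, 14, 30))
    print(shallow_features(answer, question_time))
```

Each feature needs only a single pass over the answer text or a timestamp subtraction, which is consistent with the claim that such features can be computed in near real-time. A feature dictionary like this could then be fed to any binary classifier; the abstract does not say which one was used.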
2016
978-1-4503-4427-2
Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11586/169849