While doing MT, Taja noticed that quite a lot of the text in the FI transcriptions is in fact in Swedish but is not marked as such.
This is especially bad for MT, as the Finnish model is applied to all text marked as Finnish, which here includes the Swedish text. The result is that the Swedish text remains untranslated.
A quick count (possible because the Swedish words in the MTed corpus are analysed as unknown PoS, i.e. 'X') shows that this affects 1,743,576 tokens (7.6%).
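For reference, a count like this could be reproduced roughly as follows. This is only a minimal sketch, not the script actually used: it assumes the MTed corpus is available as CoNLL-U-style tab-separated lines with the PoS tag in the fourth column, and the function name `count_unknown_pos` is hypothetical. Note that 'X' can also be assigned to other unanalysable tokens, so the result is an estimate of the Swedish share rather than an exact figure.

```python
from typing import Iterable, Tuple


def count_unknown_pos(lines: Iterable[str], tag: str = "X") -> Tuple[int, int, float]:
    """Count tokens whose PoS column equals `tag` (assumed CoNLL-U layout).

    Returns (tagged_count, total_tokens, share_of_total).
    """
    unknown = 0
    total = 0
    for line in lines:
        line = line.rstrip("\n")
        # Skip sentence separators and comment/metadata lines.
        if not line or line.startswith("#"):
            continue
        cols = line.split("\t")
        if len(cols) < 4:
            continue
        # Skip multiword-token ranges (e.g. "1-2") and empty nodes (e.g. "1.1").
        if "-" in cols[0] or "." in cols[0]:
            continue
        total += 1
        if cols[3] == tag:  # assumption: column 4 holds the PoS tag
            unknown += 1
    share = unknown / total if total else 0.0
    return unknown, total, share
```

It could be called on an open file, e.g. `count_unknown_pos(open(path, encoding="utf-8"))`, where `path` points to one file of the MTed corpus.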
Obviously this can't be corrected for 4.0, so setting it to the Future milestone.