SkyrimStringTranslator

Post » Tue Jan 01, 2013 10:45 am

There are already a few translation tools, but I needed something that lets me re-use translations between sessions and mod updates.
Skyrim String Localizer (SSL) did that very well at some point but, unfortunately, it now produces corrupted esp files and has become unreliable... :/

With TesVEdit (big thanks to you!!), it's now possible to delocalize any esp safely, so this is my take on this kind of tool.


I uploaded the tool to Nexus: http://skyrim.nexusmods.com/mods/29148
It's the very first public version, in early beta stage, but already functional (I am using French strings produced by this tool in-game, without issues so far).

Here is the readme (hard for me to write, since English is not my main language; it still needs to be expanded).
I hope this will help. I am open to any suggestions.


-----------------------------------------------------------

SkyrimStringTranslator (SST) is an advanced editor for translating the strings files that come with a delocalized esp (STRINGS, DLSTRINGS, ILSTRINGS).


SST is built with automatic translation logic in mind, and allows easy translation updates.
This means that the best way to use SST is to first build a dictionary from a source language and a destination language. So far, the best source language is 'English', because almost all mods are made in this language. So you will need all 3 skyrim_english.(S,DL,IL)Strings files in your Skyrim\data\Strings\ folder (you can find these strings in the download section of SSL: http://skyrim.nexusmods.com/mods/2889 ).
(SSL is another translation tool, which was very good and I hope will be again, but it now unfortunately produces corrupted esp files and has become unreliable, so I wanted an alternative.)


Here is a quick how-to for SST:
Note: The UI is still a bit rough and needs to be improved. However, it's already functional.
I assume you have both sets of Strings files (skyrim_english and your native Skyrim language) in the Strings folder, as I said before.


For my example, let's say I want to translate the Unofficial Skyrim Patch from English to French (I am French, you may have noticed my bad English :wink:)

First I need to delocalize the Unofficial Skyrim Patch ( http://skyrim.nexusmods.com/mods/19 ).
The best way to get the strings files for any addon is to use TesVEdit ( http://skyrim.nexusmods.com/mods/25859/ ).

With TesVEdit:
- Load the Unofficial Skyrim Patch.esp
- Once it's loaded, right-click on the name of the esp on the left; in the context menu, choose Other -> Localization, check that 'Language' is set to English, then choose -> Delocalize -> YES -> don't check anything, then save the files.

You now have a delocalized Unofficial Skyrim Patch.esp and 3 Strings files in the Strings folder, named:
Unofficial Skyrim Patch_English.STRINGS
Unofficial Skyrim Patch_English.DLSTRINGS
Unofficial Skyrim Patch_English.ILSTRINGS
(Note: if you try to launch Skyrim at this point, it will crash, because the esp no longer has its strings embedded and you don't have the matching STRINGS files for your native Skyrim language.)
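
As a side note, the naming convention above is easy to check with a small script before launching the game. The sketch below is in Python (not SST's own Pascal) and only assumes the <plugin>_<language>.<EXT> pattern shown in the list; the function names are made up for this example.

```python
from pathlib import Path
from typing import List

# The three string table types that accompany a delocalized plugin.
STRING_EXTENSIONS = ("STRINGS", "DLSTRINGS", "ILSTRINGS")


def expected_string_files(plugin_name: str, language: str) -> List[str]:
    """Return the three strings file names expected for a plugin and language."""
    stem = Path(plugin_name).stem  # drops the ".esp" extension
    return [f"{stem}_{language}.{ext}" for ext in STRING_EXTENSIONS]


def missing_string_files(strings_dir: str, plugin_name: str, language: str) -> List[str]:
    """Return the expected strings files that are not present in strings_dir."""
    folder = Path(strings_dir)
    return [
        name
        for name in expected_string_files(plugin_name, language)
        if not (folder / name).is_file()
    ]
```

For example, missing_string_files(r"Skyrim\data\Strings", "Unofficial Skyrim Patch.esp", "French") would list whichever of the three French files still need to be produced.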

SST will help you translate those files, and will also make it easier to update them for the next versions of this mod.

- Launch SST.
- In the menu, choose file -> setStringsDirectory, then pick a file in the Skyrim Strings directory (usually skyrim\data\strings). It doesn't matter which file at this point; this is just to tell SST where that directory is.

- In the 1st edit box, enter English, and enter the destination language in the 2nd box (I chose French).
- Click LoadCache in the vocabulary tab.
- If you have Skyrim_english.strings and skyrim_yourlanguage.strings in the properly set strings directory, SST then builds a vocabulary cache (that takes a few seconds).
Once it's done, you should be able to browse all cached strings in the vocabulary list.
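
To give an idea of what such a vocabulary cache can look like, here is a minimal Python sketch. It assumes the two language files have already been parsed into ID -> text dictionaries, and that the same ID refers to the same string in both languages. build_vocabulary is a hypothetical name, and this is not SST's actual code.

```python
from typing import Dict


def build_vocabulary(source_table: Dict[int, str], dest_table: Dict[int, str]) -> Dict[str, str]:
    """Pair source and destination texts that share a string ID.

    Assumption: both tables map string ID -> text, and a given ID
    identifies the same string in each language file.
    """
    vocabulary: Dict[str, str] = {}
    for string_id, source_text in source_table.items():
        dest_text = dest_table.get(string_id)
        if source_text and dest_text:
            # Keep the first translation seen for a given source text.
            vocabulary.setdefault(source_text, dest_text)
    return vocabulary
```

Keeping only the first translation per source text is a simplification; a real tool would probably keep the alternatives as well.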

Now you can load the Unofficial Skyrim Patch_English.STRINGS you want to translate.
- In the menu, choose file -> load, then pick Unofficial Skyrim Patch_English.STRINGS.
- Once it's loaded, go to the Autotranslation panel, then click TranslateExactMatch.
- All strings with an exact English match will be directly translated into your language. Untranslated strings now have a red background.
- At this point, you can either translate the red strings manually (select a line, press Enter, edit, press Ctrl-Enter to validate), or try a heuristic translation.
- Press heuristic translation: SST will search through similar strings and try to find the best match.
Note: since it takes more time (between 10 s and 2-5 min, depending on the addon and the type of strings; it's usually slower for the ILSTRINGS), the heuristic translation only runs on the currently visible list.
Once it's done, you get some control: the best matches are in green, and the not-so-good matches shade towards red.
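
The two passes described above (exact match first, then a similarity search on what is left) can be sketched roughly as follows in Python. This is only an illustration: Python's difflib stands in for whatever similarity measure SST really uses, the 0.6 cutoff is simply difflib's default, and the function names are hypothetical.

```python
import difflib
from typing import Dict, List, Tuple


def translate_exact(strings: Dict[int, str], vocabulary: Dict[str, str]):
    """Split strings into those with an exact vocabulary match and the rest."""
    translated: Dict[int, str] = {}
    pending: Dict[int, str] = {}
    for string_id, text in strings.items():
        if text in vocabulary:
            translated[string_id] = vocabulary[text]
        else:
            pending[string_id] = text  # would be shown with a red background
    return translated, pending


def heuristic_candidates(
    text: str, vocabulary: Dict[str, str], limit: int = 3, cutoff: float = 0.6
) -> List[Tuple[float, str, str]]:
    """Return (score, source text, translation) for the closest vocabulary entries, best first."""
    close = difflib.get_close_matches(text, list(vocabulary), n=limit, cutoff=cutoff)
    return [
        (difflib.SequenceMatcher(None, source, text).ratio(), source, vocabulary[source])
        for source in close
    ]
```

The score between 0 and 1 is the kind of value that could drive a green-to-red colouring, and returning several candidates is what makes it possible to offer alternative matches for a line.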

When you click on a match, you can also see all alternative matches (if there are any), and you can select the most appropriate one.
If you think the string is completely wrong, you can right-click on it and choose 'Cancel translation'. But if you think it's good, you can choose 'Validate translation' (shortcut S) and the line turns blue. At this point, the string can be included in the userCache (choose save userCache), and these strings will be kept for any future updates.
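
For readers wondering how a user cache can survive between sessions and mod updates, here is a small Python sketch. It assumes the cache is just a mapping from source text to validated translation stored as JSON; SST's real file format is not described here, so the format and the function names are assumptions.

```python
import json
from pathlib import Path
from typing import Dict


def save_user_cache(path: str, validated: Dict[str, str]) -> None:
    """Write validated translations (source text -> translation) to a JSON file."""
    payload = json.dumps(validated, ensure_ascii=False, indent=2, sort_keys=True)
    Path(path).write_text(payload, encoding="utf-8")


def load_user_cache(path: str) -> Dict[str, str]:
    """Read the cache back, or return an empty cache if the file does not exist yet."""
    cache_file = Path(path)
    if not cache_file.exists():
        return {}
    return json.loads(cache_file.read_text(encoding="utf-8"))


def apply_user_cache(strings: Dict[int, str], user_cache: Dict[str, str]) -> Dict[int, str]:
    """Re-apply validated translations to a freshly loaded strings file."""
    return {sid: user_cache[text] for sid, text in strings.items() if text in user_cache}
```

Because the cache is keyed by the source text rather than by string ID, a validated translation still applies after a mod update as long as the English text itself did not change.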

The Unofficial Skyrim Patch is a big mod with a lot of strings to check, so it can take a while before you finish the task, but if you save your user cache, you can reload Unofficial Skyrim Patch_English.STRINGS at any time and resume your work.

Once it's OK, just choose Save, and SST will save all files in your destination language, with the proper encoding. (Encoding probably needs some testing in some languages.)
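
Since the encoding step is the part that needs testing, here is a small Python sketch of one way to check a translation before saving it in a single-byte code page. It is not taken from SST; cp1252 is used as the default only because it is the usual Western European code page, and unencodable_characters is a hypothetical helper.

```python
from typing import List


def unencodable_characters(text: str, encoding: str = "cp1252") -> List[str]:
    """Return the characters of text that cannot be stored in the given encoding."""
    rejected = []
    for character in text:
        try:
            character.encode(encoding)
        except UnicodeEncodeError:
            rejected.append(character)
    return rejected
```

An empty result means the string can be written in that code page without loss; for instance, a Russian translation would be rejected by cp1252 but accepted by cp1251.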
Grace Francis
 
Posts: 3431
Joined: Wed Jul 19, 2006 2:51 pm

Post » Tue Jan 01, 2013 6:23 am

It's always good to see someone else's approach to implementing a tool.

However, I doubt that your encoding stuff is spot-on, simply because all the published data on it is wrong (even the stuff I wrote, it turns out...), and my investigations have thus far shown the situation to be a bit of a nightmare.

If you're willing to join forces to figure it all out, just have a skim through the last StrEdit thread (linked in the OP of http://www.gamesas.com/topic/1425251-rel-stredit).

Also: source code?
Nathan Hunter
 
Posts: 3464
Joined: Sun Apr 29, 2007 9:58 am

Post » Tue Jan 01, 2013 1:44 pm

@Wrinklyninja:
It seems that I missed your thread.
Someone talked to me about a Stringedit tool on the TesVEdit thread, and I found a very rough old tool on Nexus that has this name, so I didn't look further.

I don't know yet if my encoding is good for every language. It's working in French at least, at the moment.
However, I don't use UTF-8 by default (only if it's forced), but the usual ANSI code pages (1252, 1251, etc.).
What do you mean by "all the published data on it is wrong"? I just tried to load one of my strings files in your tool (which is interesting), and it's OK.

The source code is coming. Nexus is a nightmare atm, I can barely update the readme :/
I also made a small update to correct a silly bug when saving files (the kind of bug you only detect after a first release :wink:)


Edit: Source code uploaded.

Edit: Apparently I can load strings produced by your StrEdit only if I force UTF-8. I don't get why you are saving everything in UTF-8, since the original files are not in UTF-8.
Kathryn Medows
 
Posts: 3547
Joined: Sun Nov 19, 2006 12:10 pm

Post » Tue Jan 01, 2013 10:35 am

Yeah, I haven't really been publicising StrEdit, since it's still in beta.

I mean that all the encoding data on the file format page at http://www.uesp.net/wiki/Tes5Mod:String_Table_File_Format was wrong when I started writing StrEdit about a month and a half ago, and while I've been updating it as I investigate things, I keep finding that what I'd previously written is inaccurate. As of a few hours ago it is once more in sync with my latest findings, but there are still a lot of unanswered questions.

Thanks for the source, a pity I don't know Pascal. :P

StrEdit saves everything in UTF-8 because that's the encoding that the game seems to prefer, and it's easier just to know I can always save to that instead of having to convert into a multitude of encodings, with detection of characters that can't be converted, and the encoding selection logic, etc. It isn't a problem apart from the alias encoding issue, but that can also happen if you are using files encoded in two or more of the secondary encodings, so almost paradoxically the only total solution is for all strings to be encoded in UTF-8. There's just the not-inconsiderable matter of the vanilla string tables not using UTF-8 apart from Japanese. I'm working on it though...
Shae Munro
 
Posts: 3443
Joined: Fri Feb 23, 2007 11:32 am

Post » Tue Jan 01, 2013 5:07 am

The problem with UTF-8 is that there is no BOM to check.
If the same language file can be encoded in UTF-8 or in 1252 (or whatever), how does the game know which encoding to use?
Does it try to decode, re-encode and look for a match? That sounds silly and not very optimized...
Danielle Brown
 
Posts: 3380
Joined: Wed Sep 27, 2006 6:03 am

Post » Tue Jan 01, 2013 11:30 am

Because UTF-8 is a multi-byte encoding, and some byte sequences do not correspond to valid characters. If the game finds that a string contains a byte sequence that is not valid UTF-8, it uses the secondary encoding. Even if there was no secondary encoding, validation of the UTF-8 string would still have to take place. All that's really added is a conditional modification to the encoding used to interpret the string.
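
In rough Python, the logic described above would look something like this. It's a sketch of the idea, not the game's actual code: cp1252 stands in for whatever the secondary encoding is for a given language, and decode_game_string is a name made up for the example.

```python
from typing import Tuple


def decode_game_string(raw: bytes, secondary_encoding: str = "cp1252") -> Tuple[str, str]:
    """Decode raw as UTF-8 if it validates, otherwise fall back to the secondary encoding.

    Returns the decoded text and the name of the encoding that was used.
    """
    try:
        # Strict decoding doubles as validation: any invalid sequence raises.
        return raw.decode("utf-8"), "utf-8"
    except UnicodeDecodeError:
        # Bytes undefined in the secondary encoding become U+FFFD instead of failing.
        return raw.decode(secondary_encoding, errors="replace"), secondary_encoding
```

So "Épée" saved as Windows-1252 (bytes C9 70 E9 65) fails UTF-8 validation and is read with the secondary encoding, while the same word saved as UTF-8 is read as UTF-8. No re-encoding or comparison is needed.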

Saying that UTF-8 not having a BOM is bad misses the point: UTF-8 was designed as an ASCII-compatible (or superset/extension, whatever you like to call it) encoding of Unicode. If it had to have a BOM, that would make it incompatible with ASCII. Sure, it does complicate detection of UTF-8 in a string with no BOM when you're working with multiple encodings, but if you have to detect multiple encodings, you have bigger fish to fry. Like trying to figure out which of the numerous single-byte encodings the string might be in, some of which have no invalid bytes at all.
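
A quick Python illustration of both points, using Latin-1 as an example of a single-byte encoding that accepts every byte value (the helper name is made up, and the check only looks at single bytes in isolation):

```python
from typing import List


def single_bytes_rejected(encoding: str) -> List[int]:
    """Return the byte values that fail to decode on their own in the given encoding."""
    rejected = []
    for value in range(256):
        try:
            bytes([value]).decode(encoding)
        except UnicodeDecodeError:
            rejected.append(value)
    return rejected


# ASCII text is byte-for-byte identical in UTF-8, which is why a mandatory BOM would break compatibility.
same_bytes = "Iron Sword".encode("ascii") == "Iron Sword".encode("utf-8")

# Latin-1 rejects nothing, so validation can never rule it out.
latin1_rejected = single_bytes_rejected("latin-1")

# A lone byte of 0x80 or above is never valid UTF-8 by itself.
utf8_rejected = single_bytes_rejected("utf-8")

# The same two bytes are valid in both encodings but mean different things.
ambiguous = b"\xc3\xa9"
as_utf8, as_cp1252 = ambiguous.decode("utf-8"), ambiguous.decode("cp1252")
```

The last two lines show why guessing is unavoidable without a tag: the byte pair C3 A9 is a perfectly valid "é" in UTF-8 and an equally valid "Ã©" in Windows-1252.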

Beth really should have just required all strings to be in UTF-8. Another Unicode encoding would have been fine too, but UTF-8 gives the best string character length:byte size ratio for Skyrim's character usage. Yeah, I could rant for hours on this...
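
To put rough numbers on the size argument, here's a tiny Python comparison (encoded_sizes is just a throwaway helper for this post):

```python
from typing import Dict


def encoded_sizes(text: str) -> Dict[str, int]:
    """Return the number of bytes text occupies in three Unicode encodings (no BOM)."""
    return {
        encoding: len(text.encode(encoding))
        for encoding in ("utf-8", "utf-16-le", "utf-32-le")
    }
```

"Dragonborn" is 10 bytes in UTF-8, 20 in UTF-16 and 40 in UTF-32, and the accented "Épée" is 6, 8 and 16. The trade-off only flips for scripts like Japanese, where most characters take 3 bytes in UTF-8 against 2 in UTF-16.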

Oh, and I've added a list of unanswered questions to the OP of the StrEdit thread, if you're interested.
Daniel Brown
 
Posts: 3463
Joined: Fri May 04, 2007 11:21 am

Post » Tue Jan 01, 2013 7:28 am

Well, when I say "BOM", I could have said a "tag" or anything that can identify the encoding inside the file, and there is none, so we can only "guess" :/
For now, I think I will stick with the ANSI (125x) encoding since it's the default one for the official strings (SSL was also using this, and was widely used without trouble on this side, as far as I know). However, SkyrimStringTranslator works in Unicode internally.

Thank you for all your research anyway. It's useful.
Frank Firefly
 
Posts: 3429
Joined: Sun Aug 19, 2007 9:34 am

Post » Tue Jan 01, 2013 8:03 am

Good to see someone working on a translator tool. I will be trying it out.
Louise Andrew
 
Posts: 3333
Joined: Mon Nov 27, 2006 8:01 am

