Author Topic: Xanadu II Translation Development Blog  (Read 39690 times)

elmer

  • Hero Member
  • *****
  • Posts: 2153
Re: Xanadu II Translation Development Blog
« Reply #45 on: September 24, 2015, 02:25:52 PM »
Offtopic:

Not really, I'd say ... it's a "blog" about the programmer's side of trying to get a translation done.  :)

I'm trying to give people an insight into what-it-is that the programmer has to do to get the "translator's" work into a game.

That might give people a better idea of why most translations need a programmer's help, and why these things often get stalled or abandoned.

IMHO anything programming-related is fair-game. Even if it doesn't directly apply to a game, it'll probably apply to the tools that may need to be written.


Quote
I learned C++ first, or C with OOP (back in '96), and then switched over to straight C. To this day, I still don't know what all the fuss is about with classes and objects. I should know this, being a computer science major, but I won't be taking any CS courses until after my gen eds are out of the way.

I'm sure that you'll be taught all sorts of wonderful stories about how OOP is wonderful, and how to program "properly" ... usually by someone that's never worked on a large software project, or had to maintain someone else's code.

Be skeptical!  :wink:

IMHO, there are some really good things that you can do with OOP techniques, but if you get "religious" about it, you can easily overcomplicate things.

This site is a fun read after you've had someone evangelizing the benefits of C++ to you ... http://yosefk.com/c++fqa/

Personally, I mostly write C code with the C++ compiler, but there are just some things that are much easier to express with classes.

But, like most old game devs, I avoid std:: and templates like the plague!


Quote
The only thing that annoyed me with C (C99), was that I couldn't create a struct, which contained an array and a set of pointers with offsets into that array. The runtime thingy in C, or whatever it's called, won't initialize the pointers. I don't see why not; those pointers should be relative to the array in the struct to which they all belong.

I'd be curious to see what you were trying to accomplish with that struct?

TailChao

  • Full Member
  • ***
  • Posts: 156
Re: Xanadu II Translation Development Blog
« Reply #46 on: September 25, 2015, 07:00:19 AM »
Not really, I'd say ... it's a "blog" about the programmer's side of trying to get a translation done.  :)
Just wanted to chime in and say I'm really enjoying this.

It's extremely helpful to "compare notes" with how others structure their games, especially in regards to scripting and compression.

dshadoff

  • Full Member
  • ***
  • Posts: 175
Re: Xanadu II Translation Development Blog
« Reply #47 on: September 25, 2015, 01:39:54 PM »
Personally, I mostly write C code with the C++ compiler, but there are just some things that are much easier to express with classes.

Same here.
A good enough programmer is going to write code that has a beginning, a middle and an end, and is divided into all the necessary pieces neatly - no matter what language.  Kind of like the dream that C++ tries to sell.

On the other hand, a careless C++ programmer will write mangled crap anyway, no matter how much of the Kool-Aid they drank.

Quote
But, like most old game devs, I avoid std:: and templates like the plague!

Or the clap.  "STD" only meant "sexually transmitted disease" when I was younger.  Then C++ came along and made it mean something equally unpalatable.

-Dave

NightWolve

  • Hero Member
  • *****
  • Posts: 5277
Re: Xanadu II Translation Development Blog
« Reply #48 on: September 26, 2015, 03:25:23 PM »
Or the clap.  "STD" only meant "sexually transmitted disease" when I was younger.  Then C++ came along and made it mean something equally unpalatable.

-Dave

HAHAHAHAHAHA!

flame

  • Newbie
  • *
  • Posts: 1
Re: Xanadu II Translation Development Blog
« Reply #49 on: September 29, 2015, 02:28:49 PM »
I looked on RomHacking, and someone there was asking about hacking one of Falcom's PSP games, and it still seemed to be using the FALCOM2 compression scheme.
I guess I did that. It's like halfway between a straight copy of the MIPS code and a functional copy of the algorithm. I'm not smart enough to do anything more. I worked on Nayuta which used a little more complicated version of what you're calling FALCOM2. Trails in the Sky 3 PC was using straight FALCOM2.

I don't claim to be a good programmer. I can only do simple stuff. I like Python and can't understand C code. It has all those brackets {} and things. Also you have to understand C builtin functions and I'm not sure I do. I get that Python is slow. Unless you're doing a decryption algorithm though, it doesn't have to be fast, at least for Romhacking work.

Beginning, middle and end: My programs tend to follow: input, data processing, output. Is that what's meant?

elmer

  • Hero Member
  • *****
  • Posts: 2153
Re: Xanadu II Translation Development Blog
« Reply #50 on: September 30, 2015, 06:08:43 AM »
I have no idea what most of this means, but it's interesting reading...

I'm just glad if it's not sending everyone to sleep!  :wink:

***********************

It's extremely helpful to "compare notes" with how others structure their games, especially in regards to scripting and compression.

I totally agree ... it's one of the big reasons that I'm doing this.

I've often learned a lot from looking at how other people put their games together.

Everyone used to do that back-in-the-day when someone came up with a particularly "clever" effect.

***********************

Or the clap.  "STD" only meant "sexually transmitted disease" when I was younger.  Then C++ came along and made it mean something equally unpalatable.

It's amazing, to me, just how much all that templated junk and the overuse of class inheritance totally kills compiler performance.

I took a look at the Unreal Engine when they made it cheaply available a year or so back.

There's definitely some powerful stuff in there ... but OMG, it was so slow to compile!  :shock:

It took my old quad-core PC nearly 60 minutes to do a clean compile of their codebase.

It takes the same machine approx 3 minutes to do a clean compile of my last X360 game.

***********************

I worked on Nayuta which used a little more complicated version of what you're calling FALCOM2. Trails in the Sky 3 PC was using straight FALCOM2.

Welcome!

Congratulations on getting those games decompressing, it's interesting to hear about the progression of Falcom's compression over the years.

Are those translations released, yet?


Quote
Beginning, middle and end: My programs tend to follow: input, data processing, output. Is that what's meant?

I'm going to guess that Dave is talking about classic "procedural" or "functional" programming.

That's where you can generally look at the source code and see the flow of the program execution.

Once you get too deep into some of the tricks that "object-oriented" programming makes look simple, like event handlers, and lists/trees/etc of objects running update/collision/etc methods, then things quickly become hard to follow, and harder to debug.

When you add in multiple threads on modern systems, then the complexity ramps up by orders-of-magnitude.

Things very quickly get to the point that only the person/people that originally wrote it have any chance of debugging it. And as modern games show, even they can't catch a lot of the problems.

NightWolve

  • Hero Member
  • *****
  • Posts: 5277
Re: Xanadu II Translation Development Blog
« Reply #51 on: September 30, 2015, 06:59:28 AM »
I worked on Nayuta which used a little more complicated version of what you're calling FALCOM2. Trails in the Sky 3 PC was using straight FALCOM2.

That's interesting, so they went back to their LZSS stuff and didn't use the general zlib like they did for Ys VI/Felghana/Origin ? Huh. The nice thing with zlib as a standard was you grab the public DLL and you've got both encode/decode functions, you wouldn't just be looking at their decode function in x86 ASM and then have to study it enough to reverse it...

Another thing that Falcom are bastards for was their decision after Ys II Complete to begin splitting the base script file into over a thousand! It varied between 1300 to 1600 text files when it came to Ys VI/Felghana/Origin/etc. ED6 FC used about 600 as I recall. But yeah, it seemed a nightmare at first given you'd have to rebuild all of them and it took me a while to come up with a clever solution to handle it. Which then was not so much clever seemingly as it was obvious, just that I'm a slow learner. Heh. I paid a great many penalties in lost time just staring at the screen. ;)
« Last Edit: September 30, 2015, 07:02:48 AM by NightWolve »

elmer

  • Hero Member
  • *****
  • Posts: 2153
Re: Xanadu II Translation Development Blog
« Reply #52 on: September 30, 2015, 08:05:03 AM »
But yeah, it seemed a nightmare at first given you'd have to rebuild all of them and it took me a while to come up with a clever solution to handle it.

I'd be curious to see a snippet from one of those script files, if you'd like to post one.

***********************

I'm beginning the process of dumping the Xanadu 2 scripts now, and it's already showing that the script really is a fully-fledged programming language.

Much to my disappointment, they're also interleaving "script" code with "assembler" code.

That means that I may have to add a complete HuC6280 assembler/disassembler into the translation tool!  ](*,)

At this point I'm very curious as to how Falcom originally wrote this whole thing.  :-k

I'd guess that it was all done with a good macro-assembler ... but if so, it was either a custom-developed one in order to deal with their text-encoding system, or they did a lot of cut-n-paste with both SJIS and "encoded" text strings in the same file.

Anyway, here's just a small section of the very first script chunk in the game ...

$a6a3 .scriptA6A3:
$a6a3   _enable_8x12_font()
$a6a5   _set_pen_then_call_then_eol( orange, .scriptAE05 )
$a6a9   _disable_8x12_font()
$a6ab   _tst_2b03_x_bnz( $01, $02, .scriptA877 )
$a6b0   _tst_2b03_x_bnz( $01, $10, .scriptA748 )
$a6b5   _tst_2b03_x_bnz( $01, $20, .scriptA71B )
$a6ba   _tst_2b03_x_beq( $20, $01, .scriptA8AB )
$a6bf   _tst_2b03_x_beq( $20, $02, .scriptA8AB )
$a6c4   _tst_2b03_x_beq( $20, $04, .scriptA8AB )
$a6c9   _tst_2b03_x_beq( $20, $08, .scriptA8AB )
$a6ce   {アリオスさま、準備が整ったようですな。そういえば、航海長が}
$a6f1   _eol()
$a6f2   {お話があるとのことです。}
$a6ff   _wait_for_keypress_then_clear()
$a700   {後部甲板に行ってみてはいかがですか?}
$a717   _set_bits_2b03_x( $01, $20 )
$a71a   _wait_for_keypress_then_end()


NightWolve

  • Hero Member
  • *****
  • Posts: 5277
Re: Xanadu II Translation Development Blog
« Reply #53 on: September 30, 2015, 08:54:38 AM »
But yeah, it seemed a nightmare at first given you'd have to rebuild all of them and it took me a while to come up with a clever solution to handle it.
I'd be curious to see a snippet from one of those script files, if you'd like to post one.

Sure, let's pick one from Felghana. So the script was expanded out to 1,766 .XSO files when they used to use 1 or 2 files for Ys I & II Complete... Not every XSO has S-JIS text in it though, so you can eliminate a hundred or so that don't have it. But yeah, I always wondered why they did that, if it was intentional to possibly make the job of fan translation tougher...



So that's after it was ZLIB decoded/decompressed - the files had a .Z extension if compressed.
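
For anyone following along in Python, that decode step can be sketched with the standard zlib module. This is a minimal sketch, assuming the .Z files are bare zlib streams with no extra header in front (not confirmed for these games); the function name and the sample data are made up so that the snippet runs on its own.

```python
import zlib

def decode_z(raw: bytes) -> bytes:
    """Inflate one compressed script file (assumed to be a bare zlib stream)."""
    return zlib.decompress(raw)

# Stand-in for the contents of a .Z file, built here so the example is self-contained.
sample = zlib.compress("アドル".encode("shift_jis") + b"\x00")
print(decode_z(sample))
```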

Here's the whole file:

http://www.mediafire.com/download/hqzh8xd9hedzmdu/TALKRANDOLF.XSO

I dunno if you have a hex editor with S-JIS-to-Unicode mapping to allow easy viewing so I took that little snapshot.

So, I took the easy route here in the aftermath, just scanned for S-JIS lead byte and 2nd byte pairs and loaded that as a string till null, repeat, etc. I escaped having to rebuild any of these files since I came up with the idea of intercepting the print function to crunch the current Japanese string to a CRC32, take that as a 4 byte index and match it to these but return the English replacement I would have next to it in a database record, etc.
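
The scanning half of that can be sketched roughly like this in Python. It is only an illustration of the idea, not the original tool: the function names and the min_chars filter are made up, and the byte ranges are the standard Shift-JIS lead/trail ranges without the vendor extension rows.

```python
def is_lead(b: int) -> bool:
    # First byte of a two-byte Shift-JIS character.
    return 0x81 <= b <= 0x9F or 0xE0 <= b <= 0xEF

def is_trail(b: int) -> bool:
    # Second byte of a two-byte Shift-JIS character.
    return 0x40 <= b <= 0x7E or 0x80 <= b <= 0xFC

def find_sjis_strings(data: bytes, min_chars: int = 2):
    """Return (offset, text) for each null-terminated run that starts with a valid S-JIS pair."""
    results = []
    i = 0
    while i + 1 < len(data):
        if is_lead(data[i]) and is_trail(data[i + 1]):
            end = data.find(b"\x00", i)
            if end == -1:
                end = len(data)
            try:
                text = data[i:end].decode("shift_jis")
            except UnicodeDecodeError:
                i += 1          # false positive inside binary data, keep scanning
                continue
            if len(text) >= min_chars:
                results.append((i, text))
            i = end + 1         # carry on after the terminator
        else:
            i += 1
    return results
```

A real file will throw up some false positives where binary data happens to look like text, so the results still want a human eyeball before they go into the database.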

So in the database, you'd store the FileID, Offset, CRC32, Japanese string, and English string (after your translator did his/her job), etc. and then output that data as arrays in a "C" header file for compilation/usage. One array for the CRC32 and another for the English string, both sorted by CRC32. So when you search for a CRC32 based on what the print function was about to do, the index you find it at is the same index that'll fetch the English string in the other array. Blah blah, you get the idea.

It was pretty cool how it all worked out like a charm. I wondered if there'd be a detectable slowdown in implementing this though, but you couldn't tell the difference in the slightest bit! I did sort the CRC32, English String pairs by the CRC32 so I could use binary search instead of linear 1-to-n max iteration searches as a novice would do, but I'd bet even if I cheaped out and did a basic for loop for a linear search, I still wouldn't have noticed any difference because something like that with only 4,000 to 6,000 4-byte elements shouldn't have been much of a big deal.
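
The two-parallel-arrays lookup described above can be sketched in a few lines of Python. This is a sketch under assumptions: zlib.crc32 stands in for whichever CRC32 variant the real hook used, build_tables/lookup are made-up names, and the sample strings are placeholders rather than lines from the game.

```python
import bisect
import zlib

def build_tables(pairs):
    """Return (crcs, english), both ordered by the CRC32 of the S-JIS bytes."""
    rows = sorted((zlib.crc32(jp.encode("shift_jis")), en) for jp, en in pairs)
    return [crc for crc, _ in rows], [en for _, en in rows]

def lookup(crcs, english, jp_bytes):
    """Binary-search the CRC array; the hit index also selects the English string."""
    key = zlib.crc32(jp_bytes)
    i = bisect.bisect_left(crcs, key)
    if i < len(crcs) and crcs[i] == key:
        return english[i]
    return None  # no translation on record, so the caller prints the original

crcs, english = build_tables([("はい", "Yes"), ("いいえ", "No")])
print(lookup(crcs, english, "はい".encode("shift_jis")))
```

Note that this does nothing about two different Japanese strings hashing to the same CRC32; with only a few thousand strings that is unlikely, but a build script could check for it when it generates the tables.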

I DID cause a detectable slowdown in another area though for image replacement, which I never got to correct in a released patch! But that's another paragraph or so in your thread. ;)
« Last Edit: May 06, 2018, 03:16:43 PM by NightWolve »

elmer

  • Hero Member
  • *****
  • Posts: 2153
Re: Xanadu II Translation Development Blog
« Reply #54 on: October 01, 2015, 05:29:05 AM »
But yeah, I always wondered why they did that, if it was intentional to possibly make the job of fan translation tougher...

Hahaha ... they couldn't care less!  :lol:

They'll have done it for their own reasons, because it made sense at the time.

Probably so that they could have multiple designers working on different parts of the game at the same time.


Quote
I dunno if you have a hex editor with S-JIS-to-Unicode mapping to allow easy viewing so I took that little snapshot.

Thanks, I took a quick look at that .xso file.

So there's a bunch of data (and script code?) at the start, and the whole thing ends with the string data.

The string data consists of a table of offsets to each string, and then the strings themselves in regular C format.

That seems like a very standard sort-of-thing for the 32-bit era when writing the game in "C".

It's nice that all of the string data is right at the end of the file ... that would have made it about as trivial to hack/replace as you can possibly get!
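
Reading that kind of offset-table-plus-C-strings layout could look something like the Python below. It is a sketch only: the 32-bit little-endian offsets, the offsets being relative to the start of the string section, and the string count being known up front are all assumptions made for illustration, not details taken from the .xso file.

```python
import struct

def read_string_table(section: bytes, count: int):
    """Decode `count` null-terminated S-JIS strings addressed by a leading offset table."""
    offsets = struct.unpack_from("<%dI" % count, section, 0)
    strings = []
    for off in offsets:
        end = section.index(b"\x00", off)   # C-style terminator
        strings.append(section[off:end].decode("shift_jis"))
    return strings

# Hypothetical two-string section: 8 bytes of offsets, then the strings themselves.
first = "はい".encode("shift_jis") + b"\x00"
second = "いいえ".encode("shift_jis") + b"\x00"
section = struct.pack("<2I", 8, 8 + len(first)) + first + second
print(read_string_table(section, 2))
```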

But your DLL hack is a really nice solution that avoids changing too many of the original files and bloating up the size of the patch.

Unfortunately, the Xanadu 2 patch is likely to be another "windows-executable" style patch, since almost all of the game data is going to get re-compressed.

I'm still curious what the PCE YsIV data looked like ... the 16-bit era was when developers were still coming up with "creative" solutions in order to fit things into the limited RAM/ROM.

Bonknuts

  • Hero Member
  • *****
  • Posts: 3292
Re: Xanadu II Translation Development Blog
« Reply #55 on: October 02, 2015, 06:04:19 AM »
Do any of the Xanadu games prime the LZ window/buffer before decompression? I can't remember if it was Dracula X or Gate of Thunder, or some other PCECD game, but the game would prime the buffer with a series of values before running the decompression routine. Beginning/leading referencing strings would rely on the presence of these values (not just cleared or zeroed data in the buffer).
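
To show what priming means in practice, here is a toy LZ decoder in Python. It is purely illustrative: the token format (a literal byte, or a copy with distance and length) is invented for the example and is not the format of any of the games mentioned.

```python
def lz_decode(tokens, primed=b""):
    """Decode ('lit', byte) and ('copy', distance, length) tokens.

    `primed` is the history that is placed in the window before decoding
    starts, so that the very first copies have something to refer back to.
    """
    window = bytearray(primed)
    start = len(window)
    for tok in tokens:
        if tok[0] == "lit":
            window.append(tok[1])
        else:
            _, distance, length = tok
            for _ in range(length):
                window.append(window[-distance])  # byte-by-byte so overlaps work
    return bytes(window[start:])

# The stream opens with a back-reference, which only works with a primed window.
print(lz_decode([("copy", 4, 4), ("lit", 0x21)], primed=b"    "))
```

Run the same tokens with an empty window and the first copy has nothing to read, which is why a re-compressor has to know about the priming values too.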

elmer

  • Hero Member
  • *****
  • Posts: 2153
Re: Xanadu II Translation Development Blog
« Reply #56 on: October 02, 2015, 09:06:35 AM »
Do any of the Xanadu games prime the LZ window/buffer before decompression?

Not unless I'm totally missing something!  :wink:

From what I'm seeing, the game loads up a complete 128KB META_BLOCK into RAM, and then when it wants to decompress an 8KB DATA_CHUNK, it maps the appropriate section of the META_BLOCK into $8000-$BFFF, and then decompresses it into $C000-$DFFF.

That memory layout pretty much stops them from using the preload trick.

elmer

  • Hero Member
  • *****
  • Posts: 2153
Re: Xanadu II Translation Development Blog
« Reply #57 on: October 08, 2015, 10:01:10 AM »
It's been a while, so time for an update.

The "script" code seems to be all extracted ... but that's not much use if it can't be modified and replaced.

The problem is that there's a lot of interleaved script code and assembly code ... there's even some bits of "dead" code and script in there!

That makes me absolutely certain that this was all created with a macro-assembler and not a "level editor".

I've written an HuC6280 disassembler and am now running that as part of the script-extraction.

It was actually quite fun to go "old-skool" with that and try to get it as small as possible so that I can have a version of it that runs in-game on the PCE, just like Chris Covell's excellent PCEmon. I think that it should fit into approx 1024 bytes (hopefully less) on a PCE, including instruction cycle counts.
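
For readers who haven't seen one, a table-driven disassembler boils down to something like this Python sketch. It is nowhere near the real tool: it only knows five opcodes, the function name is made up, and the test bytes are the standard 65C02 encodings written out by hand rather than bytes dumped from the game.

```python
# opcode -> (mnemonic, operand format, instruction length); a tiny subset only
OPCODES = {
    0xA9: ("lda", "#${:02x}", 2),  # immediate
    0x1C: ("trb", "${:04x}", 3),   # absolute
    0x20: ("jsr", "${:04x}", 3),   # absolute
    0xC8: ("iny", "", 1),          # implied
    0x60: ("rts", "", 1),          # implied
}

def disassemble(code: bytes, origin: int):
    """Return one text line per instruction; unknown bytes become _byte() lines."""
    lines = []
    pc = 0
    while pc < len(code):
        op = code[pc]
        entry = OPCODES.get(op)
        if entry is None or pc + entry[2] > len(code):
            lines.append("${:04x}   _byte( ${:02x} )".format(origin + pc, op))
            pc += 1
            continue
        mnemonic, fmt, size = entry
        if size == 1:
            text = mnemonic
        elif size == 2:
            text = mnemonic + "  " + fmt.format(code[pc + 1])
        else:
            text = mnemonic + "  " + fmt.format(code[pc + 1] | (code[pc + 2] << 8))
        lines.append("${:04x}   {}".format(origin + pc, text))
        pc += size
    return lines

for line in disassemble(bytes([0xA9, 0x01, 0x1C, 0x00, 0x2C, 0x60]), 0xADFF):
    print(line)
```

The hard part is not this loop, it is deciding which bytes are code in the first place.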

***********************

AFAIK (and I'd love to know if I'm missing some other alternative), there are only 3 basic strategies for changing the text in a translation ...

1. Just overwrite the existing text and only allow strings the same size or shorter than the original.
2. Change the "pointer" to the string to point to your translated string that's somewhere else.
3. Reassemble the original code/script from "source" with the new translated strings, just like the original developers would have done.

Given the lack of free memory in the PCE, I've been thinking that option 3 is probably the best thing to do, especially since it imposes the least limits on the translator.
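
For comparison, option 1 is the easiest of the three to picture in code. A minimal Python sketch, assuming a flat text block where the unused tail of a string can be filled with a padding value; the function name, the space (0x20) filler and the sample bytes are all made up for illustration and are not how Xanadu 2 stores its text.

```python
def overwrite_in_place(block: bytearray, offset: int, old_len: int,
                       new: bytes, pad: int = 0x20) -> None:
    """Replace old_len bytes at offset with `new`, padded out to the same length."""
    if len(new) > old_len:
        raise ValueError("translation is %d bytes too long" % (len(new) - old_len))
    block[offset:offset + old_len] = new + bytes([pad]) * (old_len - len(new))

block = bytearray(b"AAAAAA\x00rest")
overwrite_in_place(block, 0, 6, b"Hi")
print(block)
```

The ValueError is the whole problem with this approach: every line that comes out longer than the original has to go back to the translator to be trimmed.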

But the way that Falcom are mixing code and script makes this problematic ... for a start, I've actually got to reverse-engineer the script chunks back into a "source" format that I can either feed to PCEAS, or assemble/compile myself.

That's complicated by not knowing exactly where the code/script/data is in a chunk, and having to try to figure it out from various clues.

***********************

Which leads us on to this example from the very first script chunk.

We've got script that calls an assembly language function, that's next to other code that references a data table, and is followed by yet more script.

That's ugly ... but it's just about OK.

It does mean that I need to output this all in a format that some macro-assembler can handle.


$ad89   _set_pen_then_call_then_eol( orange, .scriptAE17 )
$ad8d   {アイアイサー!}
$ad94   _call_asm_from_script( .codeADFF )
$ad97   _wait_for_keypress_then_end()

.....

$ade6 .codeADE6:
$ade6   lda  .dataADFA,y
$ade9   sta  $2700,x
$adec   lda  #$20
$adee   jsr  $8a63
$adf1   iny 
$adf2   cpy  #$05
$adf4   bcc  .codeADE6
$adf6   jsr  $7feb
$adf9   rts 

$adfa .dataADFA:
$adfa   _byte( $08 )
$adfb   _byte( $00 )
$adfc   _byte( $0a )
$adfd   _byte( $00 )
$adfe   _byte( $0d )

$adff .codeADFF:
$adff   lda  #$01
$ae01   trb  $2c00
$ae04   rts 

$ae05 .scriptAE05: ; 8x12 font
$ae05   {ダイモス}
$ae09   _end()


***********************

Next up, here's some old-fashioned self-modifying code with a jump table.

This disassembly has been hand-tweaked, because I still need to write a specific disassembler-helper function to get code like this into a usable format automatically.


$a442 .codeA442:
$a442   lda  $26c0,x
$a445   asl  a
$a446   tay 
$a447   lda  .tableA487+0,y
$a44a   sta  .dataA454
$a44d   lda  .tableA487+1,y
$a450   sta  .dataA455
$a453   jmp  $0000

$a487 .tableA487:
$a487   _eptr( .codeA456 )
$a489   _eptr( .codeA460 )
$a48b   _eptr( .codeA469 )
$a48d   _eptr( .codeA473 )
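
For anyone who doesn't read 6502-family assembly, here is roughly what that dispatch does, modelled in Python. It is a sketch under assumptions: memory is a flat 64KB bytearray, _eptr is treated as a plain 16-bit little-endian pointer, and the function name is made up.

```python
TABLE_ADDR = 0xA487    # .tableA487 in the listing
JMP_OPERAND = 0xA454   # operand bytes of the "jmp $0000" at $a453

def dispatch_target(mem: bytearray, index: int) -> int:
    """Patch the jmp operand from the table and return the address it will jump to."""
    y = (index << 1) & 0xFF                       # asl a / tay: 2 bytes per entry
    mem[JMP_OPERAND] = mem[TABLE_ADDR + y]        # low byte of the target
    mem[JMP_OPERAND + 1] = mem[TABLE_ADDR + 1 + y]  # high byte sits one byte later
    return mem[JMP_OPERAND] | (mem[JMP_OPERAND + 1] << 8)

mem = bytearray(0x10000)
for n, target in enumerate([0xA456, 0xA460, 0xA469, 0xA473]):
    mem[TABLE_ADDR + 2 * n] = target & 0xFF
    mem[TABLE_ADDR + 2 * n + 1] = target >> 8
print(hex(dispatch_target(mem, 2)))
```

The awkward part for a disassembler is that the jmp operand is just $0000 in the file, so the real targets can only be found by recognising the pattern and reading the table.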


***********************

So ... there's definitely progress, but it's slow going.

NightWolve

  • Hero Member
  • *****
  • Posts: 5277
Re: Xanadu II Translation Development Blog
« Reply #58 on: October 08, 2015, 12:48:14 PM »
Yeah, games like Xak III had 16-bit pointers, sometimes a bit before the text block, sometimes after, so I would load the array after spotting it, and could then recompute each pointer to pack as much English text back into the text block, so you weren't limited by the original string size, you just had to mind the whole text block size and not go over it. In this way, you pretty much were able to fit accurate translations for every string in the block and not have to trim them to the point where loss of quality had to occur. (At least, that was the experience with Xak III.)
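
That repacking scheme can be sketched in Python along these lines. It is an illustration only: the little-endian pointers, the zero padding, the base address and the function name are assumptions for the example, not details recovered from Xak III.

```python
import struct

def repack(strings, base_addr: int, block_size: int):
    """Lay translated strings end to end and rebuild the 16-bit pointer table."""
    pointers = []
    block = bytearray()
    for s in strings:
        pointers.append(base_addr + len(block))  # where this string will now live
        block += s + b"\x00"
    if len(block) > block_size:
        raise ValueError("text block overflows by %d bytes" % (len(block) - block_size))
    block += bytes(block_size - len(block))      # pad the unused tail with zeros
    table = struct.pack("<%dH" % len(pointers), *pointers)
    return table, bytes(block)

table, block = repack([b"Hi", b"Hello"], 0x8000, 16)
print(table.hex(), block)
```

Only the total matters here, so a long line can borrow space from a short one elsewhere in the same block.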

With a compressed text block, it's a whole other beast in how it operates. As far as I know, the way it works is the game code specifies an index based on the string that it wants at the time. So, if it wants the 5th string in the block, it specifies say 4 (if we're starting at 0) and so it keeps decompressing while counting the 0/null terminators, so when you've counted the 4th null terminator, that's the end of string 4, the start of string 5, and then it knows to finish off with that string and stop further decompression into the block. Something like that.
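
In rough Python terms the count-the-terminators idea looks like this. It is a sketch of the lookup as described above, not code from any game: the decompressor is stood in for by any iterator that yields decompressed bytes one at a time, and the function name is made up.

```python
def nth_string(stream, index: int) -> bytes:
    """Pull bytes from a decompressor and return string number `index` (0-based)."""
    nulls = 0
    out = bytearray()
    for b in stream:
        if b == 0:
            if nulls == index:
                return bytes(out)   # stop here, nothing after this is decompressed
            nulls += 1
        elif nulls == index:
            out.append(b)           # we are inside the wanted string
    raise IndexError("block has fewer than %d strings" % (index + 1))

# iter(...) stands in for the output of the real decompressor.
print(nth_string(iter(b"one\x00two\x00three\x00"), 1))
```

Counting zero bytes is safe with S-JIS text because neither byte of a two-byte character can be zero.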

EDIT:
I'm still curious what the PCE YsIV data looked like ... the 16-bit era was when developers were still coming up with "creative" solutions in order to fit things into the limited RAM/ROM.

Oh right, about your Ys IV question, you basically saw it in that image of S-JIS. A decompressed text block was just null-terminated S-JIS text, that's it! No switching tricks with half-width characters, hiragana, etc. and what not. Just all S-JIS all the time... The game that uses switching tricks is Emerald Dragon and David did extensive work to decode it all to where what SamIAm and I see is S-JIS which was converted to Unicode for easier viewing on a Windows desktop.
« Last Edit: October 09, 2015, 02:04:36 AM by NightWolve »

elmer

  • Hero Member
  • *****
  • Posts: 2153
Re: Xanadu II Translation Development Blog
« Reply #59 on: October 09, 2015, 06:33:37 AM »
Yeah, games like Xak III had 16-bit pointers, sometimes a bit before the text block, sometimes after, so I would load the array after spotting it, and could then recompute each pointer to pack as much English text back into the text block, so you weren't limited by the original string size, you just had to mind the whole text block size and not go over it.

It's so nice when a developer uses a nice-and-simple scheme like that, it really makes a programmer's life so much easier.

The Zeroigar scripts were basically like that.


Quote
With a compressed text block, it's a whole other beast in how it operates. As far as I know, the way it works is the game code specifies an index based on the string that it wants at the time.

Now that's just plain slow and fugly!  :shock:

I've not seen that trick done before.

I can just-about imagine that being used for a HuCard game on the PCE (because of its limited memory), but it's horrible!

At least it should be fairly easy to translate since you've only got to worry about overall size of the complete block of compressed data.


Quote
Oh right, about your Ys IV question, you basically saw it in that image of S-JIS. A decompressed text block was just null-terminated S-JIS text, that's it! No switching tricks with half-width characters, hiragana, etc. and what not. Just all S-JIS all the time...

That was nice of them.


Quote
The game that uses switching tricks is Emerald Dragon and David did extensive work to decode it all to where what SamIAm and I see is S-JIS which was converted to Unicode for easier viewing on a Windows desktop.

Haha ... yes, you definitely want to hide the behind-the-scenes lunacy away from the poor translator.

Xanadu 2 uses a byte-to-sjis conversion table ... actually 2 of them, 1 for 12x12 glyphs and 1 for 8x12 glyphs.
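
In tool terms a conversion table like that is just a lookup in each direction. A small Python sketch, where the table contents are entirely hypothetical (the real mapping lives in the game data) and the function names are made up:

```python
# Hypothetical 12x12 table: script byte -> character. Not the game's real mapping.
GLYPH_TABLE_12X12 = {0x01: "ア", 0x02: "イ", 0x03: "サ", 0x04: "ー"}

def decode_text(raw: bytes, table) -> str:
    """Script bytes to readable text; unmapped bytes are shown as <XX>."""
    return "".join(table.get(b, "<%02X>" % b) for b in raw)

def encode_text(text: str, table) -> bytes:
    """Readable text back to script bytes (raises KeyError on an unmapped character)."""
    reverse = {ch: code for code, ch in table.items()}
    return bytes(reverse[ch] for ch in text)

print(decode_text(bytes([1, 2, 1, 2, 3, 4]), GLYPH_TABLE_12X12))
```

The extraction tool uses the first direction and the insertion tool the second, with a separate table for the 8x12 font.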


***********************

Anyway ... back to Xanadu 2.

There are 8 really large script-chunks that I've been concerned about, because they're nearly 8KB big, but only seemed to contain about 1KB of "script".

That immediately made me concerned that I was missing something important.

Now that I've finally been able to disassemble the whole chunk, it turns out that I wasn't missing much, and that there really is a lot of (ugly) code in those particular chunks.

They're the ones that handle the 8 different Weapon Shops in the game.

The good news is that this all means that it's time to write the insertion tools and start testing a chunk with some real translated text.   :D