[Translation] Script dumping gotchas

Post by MooZ »

Hi people!

I know this may have been asked a zillion times on romhacking, but what is the usual process for dumping a script?
Do you start with a standard delta search? By delta search I mean: take some text from the game, identify the ideograms, find their Shift JIS codes, and then search for successive bytes/words whose differences equal the differences between those Shift JIS codes.
If this fails, what do you usually do? Before mednafen it must have been a pain in the a*s, and I can't imagine searching for print functions without a debugger.
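
For reference, the delta search described above can be sketched roughly like this. It is only an illustration: the names `sjis_codes` and `relative_search` are made up for this example, and it assumes the game's own encoding keeps the same ordering as Shift JIS, which is often not true for ROM games.

```python
def sjis_codes(text: str) -> list[int]:
    """Shift JIS code of each character, as an integer (e.g. 0x82A0)."""
    return [int.from_bytes(ch.encode("shift_jis"), "big") for ch in text]


def relative_search(data: bytes, codes: list[int], unit: int = 1,
                    byteorder: str = "big") -> list[int]:
    """Return offsets where successive units differ the same way `codes` do.

    unit is the size in bytes of one character in the game data (1 or 2).
    """
    deltas = [b - a for a, b in zip(codes, codes[1:])]
    span = unit * len(codes)
    hits = []
    for start in range(len(data) - span + 1):
        values = [
            int.from_bytes(data[start + i * unit:start + (i + 1) * unit], byteorder)
            for i in range(len(codes))
        ]
        if all(values[i + 1] - values[i] == d for i, d in enumerate(deltas)):
            hits.append(start)
    return hits


# Hypothetical usage: look for the kana sequence in a dumped image.
# with open("game.pce", "rb") as f:
#     print(relative_search(f.read(), sjis_codes("あいう"), unit=1))
```

Short search strings give a lot of false positives, so a longer sample (five or more characters) is usually more useful.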

Post by tomaitheous »

I think it depends on the romhacker. I've always worked backwards from the VRAM upload routine (be it tilemap or just a VRAM buffer), then traced back to the string read routine, then figured out the pointer system. I see if I can identify the start and end of a specific pointer section/block. Then I manually hunt for any others in the ROM or ISO visually (using a range of viewing modes). CD games are usually easier, since even compressed text is usually just SJIS. ROMs tend to have their own special encoding system.
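
Once one string has been located, the "figure out the pointer system" step can be helped along by a brute-force scan like the one below. This is only a sketch under assumptions: that the image is headerless, that pointers are 16-bit little-endian logical addresses, and that the string's 8 KB bank may be mapped into any of the eight HuC6280 pages. `find_pointer_candidates` is a made-up name, and real games may use bank bytes or relative offsets instead.

```python
BANK_SIZE = 0x2000  # the HuC6280 maps memory in 8 KB pages


def find_pointer_candidates(rom: bytes, string_offset: int) -> list[tuple[int, int]]:
    """Return (file offset of pointer, assumed page) pairs.

    For each of the 8 logical pages, compute the address the string would
    have if its bank were mapped there, then search the image for that
    address stored as a 16-bit little-endian value.
    """
    in_bank = string_offset % BANK_SIZE
    hits = []
    for page in range(8):
        logical = page * BANK_SIZE + in_bank
        needle = logical.to_bytes(2, "little")
        pos = rom.find(needle)
        while pos != -1:
            hits.append((pos, page))
            pos = rom.find(needle, pos + 1)
    return hits
```

Hits that sit next to each other at two-byte intervals for the same page are the interesting ones, since that is what a pointer table would look like.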

Post by MooZ »

And between tile-based (Marchen Maze) and "on the fly" (Dead of the Brain, Valkyrie No Densetsu, etc.) print functions, which one did you find the easiest to hack?

Post by tomaitheous »

I'd have to say DotB, easily (being the more problematic one). Being a CD game is a problem right there. You don't know if the game will change code at some later point. Technically, a single game can have multiple versions of a routine written for it (since everything gets loaded into RAM), or data/code won't always be in the same place.

But the biggest problem on DotB was running out of free memory for the new print routine. I had to compress the new font, but access was slow, so I had to speed up sequential access to the font bitmaps with a table of 4-bit length entries (and that took memory for the code and the table), so the 'seek' routine could rapidly skip past the compressed font blocks until it reached the one it needed, then decompress that.

I tried to break as little of the existing print routine as possible (because in theory, the less you break, the less you have to deal with when something goes wrong, i.e. a bug). I left the characters as 12x12, but added hooks to keep the original VRAM pointer from incrementing at my code's request. My replacement font was, IIRC, 6x12, so I had the option to just do a normal FWF. But then we upgraded to VWF and that took much more code. Since I didn't completely break the original print routine, but added hooks instead, I needed to add even more control mechanisms to keep things in check. It was a huge headache. And there wasn't enough space left for the last control mechanism. The game uses the last two tiles as 'blank' tiles rather than manually erasing VRAM. But the original game expects the text writer to 'wrap' on the VRAM bitmap block (I think that's about 3-4 lines of 12x12). So the only fix is to have the script issue a pause (so the reader can read the text), and then do a clear screen (an original game control code that updates the tilemap of the text section to point at those last two tiles), then continue printing the text at the start of the text box.
I'd say the biggest headache was that the game used both a tilemap and a VRAM bitmap block. Every other game I've looked at that had a bitmap block (rather than a tilemap print routine system) NEVER used a combination of bitmap and tilemap updates. It really sucked.
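
To illustrate the 4-bit length table idea from the DotB font: the layout below is a guess, not the actual format. It assumes one nibble per glyph holding that glyph's compressed size in bytes, two nibbles per table byte with the high nibble first, and glyphs stored back to back. `nibble_lengths` and `glyph_span` are made-up names.

```python
def nibble_lengths(table: bytes) -> list[int]:
    """Unpack a table of 4-bit lengths, high nibble first (assumed layout)."""
    lengths = []
    for b in table:
        lengths.append(b >> 4)
        lengths.append(b & 0x0F)
    return lengths


def glyph_span(table: bytes, index: int) -> tuple[int, int]:
    """Return (start offset, length) of glyph `index` in the compressed font.

    The start is just the sum of the lengths of all earlier glyphs, so the
    seek never has to decompress anything it skips over.
    """
    lengths = nibble_lengths(table)
    return sum(lengths[:index]), lengths[index]


# Table bytes 0x35, 0x2A describe glyph lengths 3, 5, 2, 10.
print(glyph_span(bytes([0x35, 0x2A]), 2))  # (8, 2)
```

The trade-off is the one described in the post: the table and the summing code cost memory, but only the glyph that is actually needed gets decompressed.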

So yeah, DotB was much harder. I did the print routine hack for Cosmic Fantasy 1 for Dave. I don't remember if it was originally 12x12 or 16x16, but it was changed down to 8x12 IIRC. That was the first print routine where I added dual character support (it would display both the original kana/kanji and the new slim font). That way Dave could read any of the text he hadn't translated, and just look for it manually in dumped CD RAM or the ISO. That was a pretty easy print routine hack. But I also upgraded the game from CD 2.0 to SCD 3.0, and had to hack the ISO to load my print routine font and code into the untouched SCD expanded RAM upon initial game boot. So no space issues. The text string routine was also modified to use both ASCII and SJIS. I added new control codes that, when the string read routine came across them, would put it in ASCII single-byte read mode. But, IIRC, I forced the script side to always call this control code. The EOL control code would reset this flag. That way SJIS could still be read correctly in case the game combined strings via token reference. It sounds complicated, but it was fairly easy and was done pretty quickly (done in a week, bug tested the next week IIRC).
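
The ASCII/SJIS mode switch described above might look something like the sketch below on the reading side. The control code values (`ASCII_MODE = 0x01`, `EOL = 0x00`) and the function name are invented for the example; the real values used in the hack are not given.

```python
ASCII_MODE = 0x01  # hypothetical control code: switch to single-byte reads
EOL = 0x00         # hypothetical end-of-line code: ends string, resets mode


def read_strings(data: bytes) -> list[str]:
    """Decode a run of strings that mix two-byte SJIS and one-byte ASCII."""
    strings, current = [], []
    ascii_mode = False  # every string starts out in the original SJIS mode
    pos = 0
    while pos < len(data):
        b = data[pos]
        if b == EOL:
            strings.append("".join(current))
            current = []
            ascii_mode = False  # EOL resets the flag
            pos += 1
        elif b == ASCII_MODE:
            ascii_mode = True
            pos += 1
        elif ascii_mode:
            current.append(chr(b))  # one byte per character
            pos += 1
        else:
            current.append(data[pos:pos + 2].decode("shift_jis"))  # two bytes
            pos += 2
    if current:  # keep a trailing string that has no EOL
        strings.append("".join(current))
    return strings


script = b"\x01Hello\x00" + "あい".encode("shift_jis") + b"\x00"
print(read_strings(script))  # ['Hello', 'あい']
```

Because the flag resets at every EOL, an untranslated SJIS string pulled in by a token reference still decodes as two-byte text even if the previous string was ASCII.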

I did the print routine for Lady Sword. Actually, I think all I did was figure out the character encoding, decompress it, change it, and reinsert it into the ROM. I'm pretty sure the game already supported 8x16. I honestly don't remember if I modified the string read routine. I probably didn't.

Hmm. Makai Shada was easy. I just changed the 8x8 kana font to 8x8 roman.

Spriggan Mark 2: I completely broke the game's original print routine and handled everything after the string byte was fetched. I originally did an 8x8 FWF font, then updated it to 8x8 VWF. This game also changed the location of the print routine code, and of the free space too (where my new/hack code was residing). So I had to make two different hacks: one for the first third of the game, and one for the last two thirds. The text itself wouldn't fit back into all the stages, so I created a small 2 and 3 word dictionary compression based on the most common pairs and triplets in the script. The string read routine was also completely broken (all new code) for single-byte ASCII and new control codes (the old ones still worked though, and were needed at a higher level of logic for the game).
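
A dictionary compression along those lines can be sketched as below. The post doesn't say whether the pairs and triplets were characters or words, nor how tokens were encoded; this example assumes character sequences in a plain ASCII script, with one-byte tokens starting at 0x80. All names are made up for the illustration.

```python
from collections import Counter


def build_dictionary(script: bytes, slots: int) -> list[bytes]:
    """Pick the 2- and 3-byte sequences that would save the most bytes."""
    counts = Counter()
    for n in (2, 3):
        for i in range(len(script) - n + 1):
            counts[script[i:i + n]] += 1
    # Replacing an n-byte sequence with a 1-byte token saves n - 1 bytes per
    # use. Overlapping occurrences are counted too, so this is an estimate.
    ranked = sorted(counts, key=lambda s: (-(len(s) - 1) * counts[s], s))
    return ranked[:slots]


def compress(script: bytes, dictionary: list[bytes], first_token: int = 0x80) -> bytes:
    """Greedy pass: emit a token wherever a dictionary entry matches."""
    longest_first = sorted(enumerate(dictionary), key=lambda e: -len(e[1]))
    out = bytearray()
    i = 0
    while i < len(script):
        for idx, entry in longest_first:
            if script.startswith(entry, i):
                out.append(first_token + idx)
                i += len(entry)
                break
        else:
            out.append(script[i])  # no entry matched: copy the byte as-is
            i += 1
    return bytes(out)


def decompress(packed: bytes, dictionary: list[bytes], first_token: int = 0x80) -> bytes:
    """Expand tokens back into their dictionary entries."""
    out = bytearray()
    for b in packed:
        if first_token <= b < first_token + len(dictionary):
            out += dictionary[b - first_token]
        else:
            out.append(b)
    return bytes(out)
```

In a real hack the decompress side is what has to live in the string read routine, so a small dictionary and a single compare against the first token value keep the extra code short.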

I did print routines for other games, but I don't really remember what they are any more. Lost the code and such, but I didn't have translators for them - so it wasn't too big of a deal. Most games needed both new print routines AND new string routines. Every game I did had a new string routine IIRC.