[Translation] Nazo no Masquerade

Post by **MooZ** » Sun Oct 02, 2011 12:51 pm

Hi there!

I just took a quick look at this game and here is the cool stuff. The game has 2 fonts: an 8x8 one with numbers, symbols, upper case and 8x8 kanji (I think), and a 16x16 one.

The text for the introduction starts at $4dc49. The date 1921 is stored in plain ASCII. So I brutally wrote some stupid text in upper case and voilà!

I identified 2 special chars: $fc for newline, and $fb, which indicates that the next byte is one of the 16x16 symbols.
About the code: I made a quick read and it seems that it comes in 2 parts. First the text is read from ROM and copied into an array in BSS (at $2220) by the routine at $9bb6-$9bc5.

Code: Select all

9bb6: cly
      clx
9bb8: lda ($06),y
      sta $2220,x
      beq $9bc5
      inx
      jsr $9c74
      bra $9bb8
      rts
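For readers who prefer C, here is a rough sketch of what the loop above appears to do. The function name `copy_string` and the `cap` bound are assumptions added for illustration; the original routine has no bounds check and uses the 8-bit X register as index.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical C rendering of the copy loop at $9bb6.
 * src plays the role of the pointer at $06, dst the buffer at $2220.
 * cap is an added safety bound that the original code does not have. */
size_t copy_string(const uint8_t *src, uint8_t *dst, size_t cap)
{
    size_t x = 0;
    while (x < cap) {
        dst[x] = src[x];        /* lda ($06),y / sta $2220,x */
        if (dst[x] == 0x00)     /* beq $9bc5: the terminator is stored too */
            return x;           /* length without the terminator */
        x++;                    /* inx, then jsr $9c74 advances the source */
    }
    return cap;                 /* no terminator found within cap bytes */
}
```

Calling it on the bytes `'A' 'B' $00` returns 2 and leaves the zero terminator in the destination buffer.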

The drawing part happens around $9d84 ($2a and $2b are set at $9bd5).

Code: Select all

9d84: lda $10
      sta $0002
      lda $11
      sta $0003
      lda ($2a),y
      cmp #$de
      beq $9d98
      cmp #$df
      bne $9da9
9d98: iny

It seems that all the strings are contiguous. Let's call this area "the string space". This space is divided into blocks. Each block seems to start with a 3-byte header. The introduction one ($4dc46) starts with $01 $fe $fb. If you look a little bit farther in the ROM you'll find $02 $fe $fb and so on. So there must be some place telling the game which block to use. If we put a read breakpoint on $4dc46 (use *8dc46 on Mednafen), we end up at $9b13.

Code: Select all

9b13: lda ($06),y
      cmp #$fe
      beq $9b20
      sta $0e
9b1b: jsr $9c74
      bra $9b13
9b20: lda $0e
      cmp $33dd
      bne $9b1b
      ; ...

In short, this code scans the area pointed to by $06 until it finds #$fe. Remember our header ($01 $fe $fb). So it first reads $01, stores it in $0e and jumps to the next byte (that's what the routine located at $9c74 does). Next, #$fe is read and we leave the loop. Then comes the interesting part: it checks the value of $0e against $33dd (which is in RAM). If they are not equal, it resumes the block scan until a matching header is found. The next logical task is to put a write breakpoint on $33dd. And we may consider ourselves lucky, because this is what we get when we restart the game:

Code: Select all

ae55: lda #$01
      sta $33dd
      lda #$05
      clx
      jsr $9f1b
      ; ...

If you keep the write breakpoint at $33dd, skip the intro and start a new game, you'll get:

Code: Select all

8012: ldx #$02
      stx $33dd
      jsr $80f6
      ; ...

So if you modify the text of the 2nd string block, you may see something like this:

The string table (or what's close to it) is getting closer!

Next items on the todo list:

String tables.
Font data.
Extract script.

Post by **MooZ** » Mon Oct 03, 2011 8:54 pm

8x8 font table (hope it's readable by everyone).

Code: Select all

  !                       - . /
0 1 2 3 4 5 6 7 8 9           ?
© A B C D E F G H I J K L M N O
P Q R S T U V W X Y Z
            ? ぁ ぃ ぅ ぇ ぉ ゃ ゅ ょ っ
  あ い う え お か き く け こ さ し す せ そ 
た ち つ て と な に ぬ ね の は ひ ふ へ ほ ま
み む め も や ゆ よ ら り る れ ろ わ ん
  ｡ ｢ ｣ ､ ･ ｦ ｧ ｨ ｩ ｪ ｫ ｬ ｭ ｮ ｯ
ｰ ｱ ｲ ｳ ｴ ｵ ｶ ｷ ｸ ｹ ｺ ｻ ｼ ｽ ｾ ｿ
ﾀ ﾁ ﾂ ﾃ ﾄ ﾅ ﾆ ﾇ ﾈ ﾉ ﾊ ﾋ ﾌ ﾍ ﾎ ﾏ
ﾐ ﾑ ﾒ ﾓ ﾔ ﾕ ﾖ ﾗ ﾘ ﾙ ﾚ ﾛ ﾜ ﾝ ﾞ ﾟ

It looks like the full-width roman characters and half-width katakana with some hiragana between them. I was unable to decipher/identify the first "smaller" hiragana (the one under V in the image). I'll be glad if someone figures out what the 16x16 font characters are. Obviously they are kanji, but which subset of Shift-JIS are they?

[edit] I found this kanji. I think I'll have to look at the string parsing and hope it'll reveal how to identify the associated kanji.
This search page is cool. I think I found the first kanji.

1st kanji "table":

Code: Select all

  神 光 太 郎 ? つ ?
? 影 次 冴 子 ? 雪 江
夫 絹 代 一 ? ? ? ?
? 源 ? 今 田 ? ? 清
? ? ? ? ? ? ? ろ
? ? ? ? ? ? ? ?
? ? ? ? ? 部 ? ?
? ? ? ? ? ? ? 大

Post by **tomaitheous** » Tue Oct 04, 2011 2:33 am

In short, this code scans the area pointed to by $06 until it finds #$fe. Remember our header ($01 $fe $fb). So it first reads $01, stores it in $0e and jumps to the next byte (that's what the routine located at $9c74 does). Next, #$fe is read and we leave the loop. Then comes the interesting part: it checks the value of $0e against $33dd (which is in RAM). If they are not equal, it resumes the block scan until a matching header is found.

Oh, so it's doing a 'seek' through the block of strings then?

Post by **MooZ** » Thu Oct 06, 2011 12:03 pm

Yes, but from what I saw (only the first 2 blocks), $06 is initialized so that it points directly to the block header.
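As a rough C rendering of the seek discussed above (the loop at $9b13), the sketch below remembers the last byte that was not $fe and stops at the first $fe whose preceding byte equals the wanted block id. The name `find_block`, the length bound and the initial value of `last` are assumptions; the thread does not say what $0e holds on entry.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical sketch of the block seek at $9b13.
 * data/len stand in for the string space pointed to by $06,
 * wanted for the block id stored at $33dd.
 * Returns the index of the matching $fe byte, or -1 if none is found. */
long find_block(const uint8_t *data, size_t len, uint8_t wanted)
{
    uint8_t last = 0;                 /* plays the role of $0e (initial value assumed) */
    for (size_t i = 0; i < len; i++) {
        if (data[i] == 0xFE) {        /* cmp #$fe / beq $9b20 */
            if (last == wanted)       /* lda $0e / cmp $33dd */
                return (long)i;
            /* bne $9b1b: no match, keep scanning */
        } else {
            last = data[i];           /* sta $0e */
        }
    }
    return -1;
}
```

With two blocks whose headers are $01 $fe $fb and $02 $fe $fb, asking for id 2 skips the first header and stops on the second one.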

Post by **MooZ** » Sun Oct 09, 2011 4:46 pm

It seems that graphic data is encoded using a simple RLE scheme. The lowest 4 bits hold the palette index for the current pixel and the highest 4 bits hold the count. Another interesting fact is that graphics are not tile encoded; they are stored as a standard bitmap, i.e. line by line.
Here's a small C program that will extract the roman, katakana and hiragana font set. It seems that it has been encoded as a 128x128 bitmap. The output is encoded in ASCII PPM.
Here's the result:

[edit] Fixed the whole stuff! I omitted the first 2 bytes of every data "section" because I knew that they were all 128x128 bitmaps. Just check the two bytes before every offset in the source and you'll see they are always equal to $10 $10. I know that's cheating.
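The RLE scheme described in this post can be sketched in C as follows. This is not the extractor mentioned above, only a minimal illustration. It assumes the count nibble is the literal run length (whether a count of 0 has a special meaning is not known from this thread), and the name `rle_decode` is made up for the example.

```c
#include <stddef.h>
#include <stdint.h>

/* Minimal RLE decoder sketch: each input byte holds
 * a run length in its high nibble and a palette index in its low nibble.
 * Assumption: the count is used as-is, so a count of 0 emits nothing.
 * Writes one palette index per pixel and returns the number of pixels. */
size_t rle_decode(const uint8_t *in, size_t in_len,
                  uint8_t *out, size_t out_cap)
{
    size_t n = 0;
    for (size_t i = 0; i < in_len; i++) {
        uint8_t count = in[i] >> 4;     /* highest 4 bits: run length */
        uint8_t color = in[i] & 0x0F;   /* lowest 4 bits: palette index */
        for (uint8_t k = 0; k < count && n < out_cap; k++)
            out[n++] = color;
    }
    return n;
}
```

For example, the two bytes $31 $2f decode to three pixels of color 1 followed by two pixels of color 15.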

The table for graphic data is around 0x442BA (ROM file offset). For example, you'll find 13 00 60 18 there. The first byte is the bank where the data is ($13). The second byte is ignored. Then comes the offset $1860 (stored low byte first). In the code ($9576), you'll see that #$40 is added to the high byte (stored in $07). And if you translate the pointer, you get $1860 + ($13 << $0d) + $200 = $27a60: the ROM offset we found previously for the roman/hiragana/katakana font! We can now relocate data if we want. Cool, isn't it?
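The pointer translation above is easy to get wrong by hand, so here is a small C sketch of it. The function name `gfx_file_offset` is hypothetical, and passing the $200 term as a `header_size` parameter is an assumption (it is presumably the size of a header at the start of the ROM file).

```c
#include <stdint.h>

/* Sketch: turn a 4-byte graphics table entry into a ROM file offset.
 * entry[0] = bank, entry[1] = ignored, entry[2..3] = offset, low byte first.
 * header_size is the $200 term from the post (assumed to be a file header). */
uint32_t gfx_file_offset(const uint8_t entry[4], uint32_t header_size)
{
    uint32_t bank   = entry[0];
    uint32_t offset = (uint32_t)entry[2] | ((uint32_t)entry[3] << 8);
    return (bank << 13) + offset + header_size;   /* banks are $2000 bytes */
}
```

Feeding it the entry 13 00 60 18 with a header size of $200 gives $27a60, matching the calculation in the post. Relocating data would mean running the same arithmetic backwards to build a new table entry.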

Post by **MooZ** » Wed Oct 12, 2011 8:08 pm

The people on the Romhack.org forum are awesome!
First, TiDragon fixed the hiragana/katakana.

Code: Select all

  !                       - . /
0 1 2 3 4 5 6 7 8 9           ?
© A B C D E F G H I J K L M N O
P Q R S T U V W X Y Z
            を ぁ ぃ ぅ ぇ ぉ ゃ ゅ ょ っ
  あ い う え お か き く け こ さ し す せ そ
た ち つ て と な に ぬ ね の は ひ ふ へ ほ ま
み む め も や ゆ よ ら り る れ ろ わ ん
  ｡ ｢ ｣ ､ ･ ｦ ｧ ｨ ｩ ｪ ｫ ｬ ｭ ｮ ｯ
ｰ ｱ ｲ ｳ ｴ ｵ ｶ ｷ ｸ ｹ ｺ ｻ ｼ ｽ ｾ ｿ
ﾀ ﾁ ﾂ ﾃ ﾄ ﾅ ﾆ ﾇ ﾈ ﾉ ﾊ ﾋ ﾌ ﾍ ﾎ ﾏ
ﾐ ﾑ ﾒ ﾓ ﾔ ﾕ ﾖ ﾗ ﾘ ﾙ ﾚ ﾛ ﾜ ﾝ ﾞ ﾟ

And Spring identified the kanji:

Code: Select all

　神光太郎りつサ
鍵影次冴子彦雪江
夫絹代一麗享平竜
野源蔵今田たみ清
水春おヘいじころ
んぼあけちきクリ
スティ婦氷部屋書
斎執事家庭教師大
広間浴室客食堂台
所倉庫小生番駐車
場中女物置１２階
件殺人伝説洋館現
日記遺言状ハンマ
手袋ワ土地コパト
ダイヤ指輪借用薬
ビナフ探偵社僧長
円陣龍之介何貿易
商仕族産相続死父
親母息娘金仲結婚
凶器病気目犯発見
謎仮装舞踏会段狼
沼洗面的本敷調山
窓机自動嫁悪下夜
朝情報愛毒麻頭約
棚入口財因者男聞
兄撃弟迷路真実話
主妻鏡捜査証拠品
争戦関係密容疑～
索機供信罪夢使年
思名前性格業誰命
私俺刃体秘時紙道
尋問選　　読

The script can now be dumped
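To give an idea of what dumping involves, here is a small C sketch that turns one string into readable tokens using only the control codes found so far: $00 ends the string, $fc is a newline and $fb announces a 16x16 symbol. It is not the actual extractor. The name `dump_string`, the `<K:xx>` / `<xx>` notation and the idea that digits, upper case and a few symbols map straight to ASCII are assumptions based on the font table and the "1921" observation.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Assumption: these 8x8 codes match ASCII (digits, upper case, " !-./?"). */
static int is_plain_ascii(uint8_t b)
{
    return (b >= '0' && b <= '9') || (b >= 'A' && b <= 'Z') ||
           b == ' ' || b == '!' || b == '-' || b == '.' ||
           b == '/' || b == '?';
}

/* Writes a printable form of the string at s into out (NUL-terminated).
 * Returns the number of input bytes consumed (terminator not included). */
size_t dump_string(const uint8_t *s, size_t len, char *out, size_t cap)
{
    size_t i = 0, o = 0;
    if (cap == 0)
        return 0;
    out[0] = '\0';
    /* keep 8 bytes of headroom: the longest token is "<K:xx>" plus NUL */
    while (i < len && s[i] != 0x00 && o + 8 < cap) {
        uint8_t b = s[i++];
        if (b == 0xFC) {
            out[o++] = '\n';                       /* newline */
        } else if (b == 0xFB && i < len) {         /* 16x16 symbol follows */
            o += (size_t)snprintf(out + o, cap - o, "<K:%02X>", s[i++]);
        } else if (is_plain_ascii(b)) {
            out[o++] = (char)b;
        } else {                                   /* kana etc.: raw code */
            o += (size_t)snprintf(out + o, cap - o, "<%02X>", b);
        }
        out[o] = '\0';
    }
    return i;
}
```

A real dumper would replace the `<K:xx>` and `<xx>` placeholders with lookups into the kanji and kana tables above.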

Post by **Charles MacDonald** » Thu Oct 13, 2011 3:56 am

Awesome! Sounds like serious progress.

Will all the tools and stuff for this project be released like they were with Marchen Maze?

Post by **MooZ** » Thu Oct 13, 2011 7:22 am

Yes.

edit: Here's the first version of the script extractor and the script itself:

edit deluxe: I updated the archives and the script. The 8-bit half-width katakana are now replaced by their 16-bit full-width counterparts. You can still find the older version here:

Code
Script

Post by **MooZ** » Sun Oct 23, 2011 8:33 pm

I just finished the image encoder. At first I obtained smaller compressed images than the ones extracted from the ROM. It turned out that the RLE compression was not applied to the complete binary data in one pass, but to each line separately. I don't know if I'm clear, but in pseudocode, I was doing this:

Code: Select all

encode_rle(data, width*height);

While in Nazo no Masquerade, they did:

Code: Select all

for(y=0; y<height; y++)
{
    encode_rle(data + (y*width), width);
}

Anyway! Here is the [source code]. Here's how you must call the binary from the command line:

Code: Select all

img_encode file.pcx out

The input PCX image must have 8 bits per pixel and its palette must contain at most 16 colors. It will generate 2 files:

out.dat containing the rle encoded image
out.pal which is a text file containing the r,g,b triplets of the palette
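To make the per-line point concrete, here is a minimal C sketch of such an encoder. It is not the source linked above. It assumes one palette index per input byte and a maximum run of 15 (the largest value a 4-bit count can hold); `encode_image` is a name made up for the example, and `out` must have room for at least width*height bytes.

```c
#include <stddef.h>
#include <stdint.h>

/* Encode n pixels (one palette index per byte) as count/color bytes.
 * Assumption: runs are capped at 15 because the count is 4 bits wide. */
size_t encode_rle(const uint8_t *px, size_t n, uint8_t *out)
{
    size_t i = 0, o = 0;
    while (i < n) {
        uint8_t color = px[i] & 0x0F;
        size_t run = 1;
        while (i + run < n && run < 15 && (px[i + run] & 0x0F) == color)
            run++;
        out[o++] = (uint8_t)((run << 4) | color);
        i += run;
    }
    return o;
}

/* Per-line variant: runs never cross the end of a scanline. */
size_t encode_image(const uint8_t *px, size_t width, size_t height,
                    uint8_t *out)
{
    size_t o = 0;
    for (size_t y = 0; y < height; y++)
        o += encode_rle(px + y * width, width, out + o);
    return o;
}
```

On a 4x2 image filled with color 3, encoding the whole buffer at once gives the single byte $83, while the per-line version gives $43 $43. That is why the first attempt produced smaller files than the ones in the ROM.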

Post by **MooZ** » Sun Dec 04, 2011 5:46 pm

Here is the first version of the font.

PC Engine dev forum
