corCTF 2023 rev/impossible-maze

impossible_maze

The challenge binary implements a terminal game, where the player has to navigate a series of “mazes”. The player figure is moves by pressing the wasd keys, the rendering mode for the camera can be toggled with the c key and q exits the game. The mazes however feature fake tiles, invisible tiles, invisible switches, and other mechanics that make them impossible to complete only based on the information that’s rendered on screen.

The main function starts with some initialization and then enters directly into the main game loop. First thing we probably want to find out, is where input is handled.

...
  while ( 1 )
  {
    v3 = sub_5584708AC690(*(__int64 *)&qword_5584708ECC70);
    if ( v3 != -1 )
      sub_5584708A9520(v3);
...

If we break at the beginning of the loop, we can observe that pressing a key causes the branch that calls sub_5584708A9520 to be taken. In this case, sub_5584708AC690 appears to be reading input into v3 and passing it into sub_5584708A9520. sub_5584708A9520 (handle_input from now on) starts with a switch statement on the argument matching against ascii values of wasd, which confirms our suspision.

...
  switch ( a1 )
  {
    case 'a':
      v2 = 1.0;
      v1 = 0.0;
      goto LABEL_3;
    case 'c':
      dword_5584708EC87C ^= 1u;
      return;
    case 'd':
      v2 = -1.0;
      v1 = 0.0;
      goto LABEL_3;
    case 'q':
      dword_5584708EC86C = 1;
      return;
    case 'r':
      sub_5584708A9460();
      return;
    case 's':
      v1 = -1.0;
      v2 = 0.0;
      goto LABEL_3;
    case 'w':
      v1 = 1.0;
      v2 = 0.0;
...

v1 and v2 look like coordinate offsets. This is important, beause by following those we can probably make our way to absolute position of the player and the game tilemap. This turns out to be exactly what happens. The old absolute values are read and combined with the offsets into v12 and v13 and are then used to index into a tilemap of 8x8 ints (v7 always contains 8).

...
          v16 = *(_BYTE *)(v9 + (int)(v13 + v12 * v7));
          switch ( v16 )
...

We’ve now found the address of the tilemap in memory, so we can dump out the bytes and compare to what we see on screen.

In [1]: b = bytes.fromhex('00000300000000040202020000000002020000000000000202000000000000020202020202000002000000000202020200000000000000000000000000000000')

In [2]: [[hex(b[i+j])[2:] for j in range(8)] for i in range(0, len(b), 8)]
Out[2]: 
[['0', '0', '3', '0', '0', '0', '0', '4'],
 ['2', '2', '2', '0', '0', '0', '0', '2'],
 ['2', '0', '0', '0', '0', '0', '0', '2'],
 ['2', '0', '0', '0', '0', '0', '0', '2'],
 ['2', '2', '2', '2', '2', '0', '0', '2'],
 ['0', '0', '0', '0', '2', '2', '2', '2'],
 ['0', '0', '0', '0', '0', '0', '0', '0'],
 ['0', '0', '0', '0', '0', '0', '0', '0']]

This correlates perfectly with what we see on screen, modulo symmetry. At this point there were 2 possible options. At this point there it would have been possible to reverse the handling code for each type of tile and figure out it’s behavior. But due to the small number of tile types, it was more time efficient to to continue with trial and error. There were 11 levels in total. Each time I figured out how to complete a level, I recorded the correct sequence of keystrokes. A simple python script can then read input sequences and complete the levels automatically.

wddwwwaaaawaaasssss
wwwwwwwdddddddsssssss
aaaaaaawwwwwwwddsdds
sssssdddwwwwwwwaaa
dddsssssss
aadwwwwswwaaaaaaww
sdddddddssssaaaaaaassddddddd
wwaaaaaaawwdddddddww
asaaasaaassdddssdddd
aaaaaaawwwwwdddsssddddwwww
aaaaaaassdddddddssaaaaaaassddddddd

import os
import time


time.sleep(3)

with open('moves.txt') as f:
    for line in f:
        for move in line[:-1]:
            os.system(f'xdotool key {move}')
            time.sleep(0.2)
        time.sleep(0.2)

After the final level is completed, the flag is decrypted and rendered on screen.

consnop