Compete with Team Europe 2023 rev/PasteGPT Side Hustle

ai_model.zip

Digging through the source code of the linked website lead to a small piece of commented out python code. The code downloaded a file from http://jeopardy.teameurope.space:10007/ai_model.zip, unzipped the file, and loaded the model with MLflow.

The file does indeed contain a MLflow model directory. None of the metadata files are very informative, so we look into the python_mode.pkl file. The file extension suggests it’s a python pickle file.

>>> import pickle
>>> l = pickle.load(open('python_model.pkl', 'rb'))
>>> type(l)
mlflow.pyfunc.model._FunctionPythonModel
>>> dir(l)
['__class__',
 '__delattr__',
 ...
 '_get_type_hints',
 'func',
 'hints',
 'load_context',
 'predict',
 'signature']

Considering we’re supposed to be working with a ML model, predict sounds interesting.

>>> import dis
>>> dis.dis(l.predict)
117           0 RESUME                   0

118           2 LOAD_FAST                0 (self)
              4 LOAD_METHOD              0 (func)
             26 LOAD_FAST                2 (model_input)
             28 PRECALL                  1
             32 CALL                     1
             42 RETURN_VALUE

Looks like the real meat will be inside func. func in turn defines a check function that does most of the heavy lifting. The logic of func and check can be gleaned from reading the disassembly and referencing the python manual.

The function first checks whether the prompt contained words like “destroy humanity” or “flag” and returns joking answers. Next it runs the prompt through a loop that computes a checksum with xoring against constants and arithmetic operations. If the checksum result is non 0, the function returns a randomly chosen hardcoded string to return. Otherwise, a sha256 hash of the input is computed and used as a key to AES decrypt the flag.

A reimplementation of the custom checksum function looks something like this.

import random

a = (244, ..., 50)
seed = 4201337420
random.seed(seed)

def check(l):
    s = 0
    for n in a:
        c = random.randint(2, 4)
        v = random.choice(l)
        [v := eval('v {1} {0}'.format(random.choice(l), random.choice(('+', '-', '*', '^', '&', '|')))) for _ in range(c)]
        if (v ^ n) & (255 - random.randint(0, 255)):
            s += 1
    return s

The only sane way so is to encode the constraints for a SMT solver. The original solve script returned recognizable text, but with non ascii bytes thrown in. This turned out to be a bug in the challenge. I fixed the recognizable text and printed out all solutions that satisfied the constraints (luckily only 15). And sure enough, after passing all 15 into the original func function, one returned the flag.

import operator
import random

from z3 import *

l = [BitVec(f'x_{i}', 8) for i in range(188)]
a = (244, .., 50)
random.seed(4201337420)
solver = Solver()

pre_magic = 'According to all known laws of aviation, there is no way a bee should be able to fly.'
suf_magic = "(don't mind me, just increasing entropy)"
for i in range(len(pre_magic)):
    solver.add(l[i] == ord(pre_magic[i]))
for i in range(len(suf_magic)):
    solver.add(l[-i-1] == ord(suf_magic[-i-1]))

def check():
    for n in a:
        c = random.randint(2, 4)
        v = random.choice(l)
        for _ in range(c):
            o = random.choice(l)
            z = random.choice((operator.add, operator.sub, operator.mul, operator.xor, operator.and_, operator.or_))
            v = z(v, o)
        solver.add((v ^ n) & (255 - random.randint(0, 255)) == 0)

def good(x):
    solver.add(Or(x == 32, x == 39, x == 40, x == 41, x == 44, x == 46, And(x >= 48, x < 58), And(x >= 65, x < 91), And(x >= 97, x < 123)))

for i in l:
    good(i)

check()
a = []
while solver.check() == sat:
    model = solver.model()
    s = bytes(model[i].as_long() for i in l)
    a.append(s.decode())
    solver.add(Or(*(i != j for i, j in zip(l, s))))

print(a)

consnop