jsvmpzl v1.1.3: anti-formatting detection

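Set a breakpoint at the point where the parameter is generated

As you can see, the function and variable names are obfuscated with a heavy mix of upper- and lower-case characters and underscores, which makes the code very hard to read.

Use v_jstool to compress the variable names, then open the result in VS Code to check the line count: 943 lines in total. At the very bottom, the value being passed in is the virtual machine bytecode.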

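AST restoration

Identifier restoration

Looking at the code, it contains a large number of constants and references to JS methods. The JS code has been obfuscated by replacing these with identifiers.

Before restoring anything, look at the variable declarations at the top. What are the two that are generated by a function call?

Jump into N and look at how e is used inside it: N is clearly a string-decoding function. Checking the references to N shows that it is only referenced in this variable declaration.

So we can simply run the code and backfill every constant declaration with its decoded value.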
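As a rough illustration of "run the code and read back the values", here is a minimal sketch using Node's built-in vm module. The decoder body, the encoding (each character shifted by one), and the variable names a, b, c are all made up for the example; the real N and its declarations will look different. Note that vm is not a security boundary, so only do this with code you are prepared to execute.

```javascript
const vm = require("node:vm");

// Hypothetical stand-in for the obfuscated prelude: a decoder `N` plus the
// declarations that call it. The real decoder and encoding will differ.
const prelude = `
function N(e) {
    // Assumed encoding for illustration: each character shifted up by one.
    return e.split("").map(ch => String.fromCharCode(ch.charCodeAt(0) - 1)).join("");
}
var a = N("upTusjoh"), b = N("mfohui"), c = 256;
`;

// Run the prelude in an isolated context and read back the resolved values.
// Top-level `var` declarations become properties of the context object.
function resolveConstants(source, names) {
    const sandbox = vm.createContext({});
    vm.runInContext(source, sandbox);
    const resolved = {};
    for (const name of names) {
        resolved[name] = sandbox[name];
    }
    return resolved;
}

const constants = resolveConstants(prelude, ["a", "b", "c"]);
console.log(constants);
```

The resolved values can then be written back into the declarations as plain literals, after which N is no longer needed.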

Safe identifier backfill

Next, we need to check whether each of these variables is ever modified and whether it is referenced at all

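// Babel helpers used by the snippets in this post
const traverse = require("@babel/traverse").default;
const generator = require("@babel/generator").default;
const t = require("@babel/types");
// `ast` is the tree parsed from the compressed script (e.g. with @babel/parser)

// Walk the AST
traverse(ast, {
    VariableDeclaration(path) {
        const { declarations } = path.node;
        for (const declaration of declarations) {
            const { id, init } = declaration;
            if (!t.isIdentifier(id) || !init) {
                continue;
            }
            // Only simple initializers are candidates for backfilling
            if (!(t.isLiteral(init) ||
                  t.isIdentifier(init) ||
                  t.isArrayExpression(init) ||
                  t.isMemberExpression(init))) {
                continue;
            }
            // Skip member expressions where neither side is a plain identifier
            if (t.isMemberExpression(init) &&
                !t.isIdentifier(init.object) &&
                !t.isIdentifier(init.property)) {
                continue;
            }
            const binding = path.scope.getBinding(id.name);
            // No constantViolations means the variable is never reassigned
            if (binding && binding.constantViolations.length === 0) {
                console.log(id.name, "=", generator(init).code,
                            "references:", binding.references);
            }
        }
        // This var is the first one in the whole AST, so we can stop after visiting it
        path.stop();
    }
});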

Hmm, to be honest there are not many references, arguably not enough to justify the time it takes to write the code

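// Goes inside the `constantViolations` check above
if (binding.references > 0) {
    for (const refPath of binding.referencePaths) {
        // Clone so each reference gets its own node instead of sharing `init`
        refPath.replaceWith(t.cloneNode(init));
    }
}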

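After replacing the reference nodes with the code above, the declarations are visibly grayed out in the editor, meaning they are no longer used. We can safely remove the declaration nodes at the end.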

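Junk-code backfill

As you can see, besides the variables there is also a lot of junk code that turns simple member calls into function calls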
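To make the pattern concrete, here is a small made-up example of such wrappers and of what the same expression looks like once they are inlined. The names _call and _add are hypothetical and are not taken from the target script.

```javascript
// Hypothetical wrappers of the kind described above; the names are made up.
function _call(obj, key, arg) {
    return obj[key](arg);
}
function _add(x, y) {
    return x + y;
}

// Obfuscated form: a member call and an addition hidden behind wrappers.
const obfuscated = _call("a,b,c", "split", ",")[_add(0, 1)];

// Equivalent form after substituting each argument for its parameter.
const restored = "a,b,c".split(",")[0 + 1];

console.log(obfuscated, restored);
```

Both expressions evaluate to the same value; the wrappers add nothing except indirection, which is why they can be inlined mechanically.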

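Nodes like these take a bit more thought:

First, we need to find where these functions are referenced in the code
Then we can build a mapping table that pairs each parameter with the argument passed for it
Put together, the process is: find functions that match the pattern -> extract the return statement -> build the mapping table -> substitute arguments for parameters -> backfill the referencing nodes -> delete the function declaration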

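traverse(ast, {
    FunctionDeclaration(path) {
        const { id, body, params } = path.node;
        const binding = path.scope.getBinding(id.name);
        const statements = body.body;
        // Only handle wrappers whose body is a single `return <expression>;`
        if (!(statements.length === 1 && t.isReturnStatement(statements[0]))) {
            return;
        }
        const ret = statements[0].argument;
        // Skip a bare `return;` and functions that return another function
        if (!binding || !ret || t.isFunctionExpression(ret)) {
            return;
        }
        console.log(id.name, "references:", binding.references);
        let inlinedAll = true;
        for (const refPath of binding.referencePaths) {
            const callPath = refPath.parentPath;
            if (!callPath.isCallExpression({ callee: refPath.node })) {
                // The function is also used as a value, so it must be kept
                inlinedAll = false;
                continue;
            }
            const args = callPath.node.arguments;
            // Map each parameter name to its argument (missing ones become `undefined`)
            const mapping = new Map();
            params.forEach((param, index) => {
                mapping.set(param.name, args[index] || t.identifier("undefined"));
            });
            // Build the inlined expression from a copy of the return value
            const newExpression = replaceIdentifiers(t.cloneNode(ret), mapping);
            const sourceCode = generator(callPath.node).code;
            callPath.replaceWith(newExpression);
            console.log(sourceCode, "=>", generator(newExpression).code, "\n");
        }
        // Remove the declaration only when every reference was inlined
        if (inlinedAll) {
            path.remove();
        }
    }
});

// Replace parameter identifiers with their arguments in a single pass, so an
// argument that happens to contain another parameter's name is not rewritten again
function replaceIdentifiers(node, mapping) {
    // traverse() does not visit the root node itself, so handle `return a;` here
    if (t.isIdentifier(node)) {
        return mapping.has(node.name) ? t.cloneNode(mapping.get(node.name)) : node;
    }
    traverse(node, {
        noScope: true,
        Identifier(path) {
            // Ignore names that are not real references, e.g. `b` in `a.b`
            if (!mapping.has(path.node.name) || !t.isReferenced(path.node, path.parent)) {
                return;
            }
            path.replaceWith(t.cloneNode(mapping.get(path.node.name)));
            path.skip();
        }
    });
    return node;
}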

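As you can see, after this round of replacement the code is immediately much cleaner

At this point it seemed like a good time to verify that the replacements were correct. I used a local override to swap the site's JS for the restored version, and the result was... an error.

I then tried formatting the original JS and substituting that instead, and got the same error. So the problem is not our restoration: the code ships with its own formatting detection.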
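For readers who have not run into this before, a formatting check can be as simple as inspecting a function's own source text. The sketch below is only an assumption about the general shape of such a check; the actual test used by this script is not shown here, and guarded and looksFormatted are made-up names.

```javascript
// Hypothetical self-check: the real script's test may look quite different.
function guarded() { return 1; }

function looksFormatted(fn) {
    // A minified function has no line breaks in its source; a beautifier adds them.
    return /\n/.test(Function.prototype.toString.call(fn));
}

// Simulate what a beautifier would produce for the same function.
const beautified = new Function("return function guarded() {\n    return 1;\n}")();

console.log(looksFormatted(guarded), looksFormatted(beautified));
```

Because Function.prototype.toString returns the function's source text as written, any reformatting of the file changes what such a check sees, even if the behaviour of the code is identical.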

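My first idea was to hook the obvious suspect and see whether the detection went through a regex match. The match log ran to more than 100 pages, though, and I could not find the relevant spot in it. That made me wonder whether the telling output might be on toString instead, and hooking Function.prototype.toString did indeed surface the relevant output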
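A hook of this kind might look like the sketch below. It is a minimal version that records every call in an array rather than printing it; in the browser you would more likely log the text or drop a debugger statement inside the hook. The names calls, nativeToString, and probe are mine, not from the target script.

```javascript
const calls = [];
const nativeToString = Function.prototype.toString;

// Wrap the native method: record the source text, then return it unchanged.
Function.prototype.toString = function () {
    const text = nativeToString.call(this);
    calls.push(text);
    return text;
};

// Any code that stringifies a function now passes through the hook.
function probe() { return 42; }
probe.toString();

// Restore the native method once the interesting calls have been captured.
Function.prototype.toString = nativeToString;

console.log(calls);
```

Scanning the recorded texts for the functions you modified is usually enough to tell which function the script is inspecting.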

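The simple fix is to replace the string that the function outputs. In practice, though, this code is also closely tied to eval, so the detection itself still needs to be removed at a later stage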
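One way to "replace the output string" is to keep a table of functions whose source text should be spoofed and answer from that table inside a patched toString. This is a sketch under my own assumptions: check stands in for whichever function the script inspects, and the one-line string stands in for its original, unformatted source.

```javascript
const originalToString = Function.prototype.toString;
// Maps a patched function to the source text the check expects to see.
const spoofedSource = new WeakMap();

function patchedToString() {
    if (spoofedSource.has(this)) {
        return spoofedSource.get(this);
    }
    return originalToString.call(this);
}
Function.prototype.toString = patchedToString;
// Make the hook itself report as native code in case the script checks that too.
spoofedSource.set(patchedToString, originalToString.call(originalToString));

// Hypothetical example: `check` is the formatted version we ship, while the
// script expects the original one-line text.
function check() {
    return 1;
}
spoofedSource.set(check, "function check(){return 1}");

console.log(check.toString());
```

This only hides the formatting from toString-based checks. It does nothing about the eval-related part mentioned above, which still has to be dealt with separately.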