Category Archives: Algorithm

Why Does Reinforcement Learning Outperform Offline Fine-Tuning? Generation-Verification Gap Explained

Fine-tuning models to achieve strong performance is a critical task in artificial intelligence. When refining large language models (LLMs) or other complex AI systems, we often have to choose between different methodologies. Two primary approaches stand out: reinforcement learning (RL) and offline fine-tuning methods like Direct Preference Optimization…


Leetcode 2429 – Minimize XOR

Source: https://leetcode.com/problems/minimize-xor/

Problem statement: Given two positive integers num1 and num2, find the integer x such that x has the same number of set bits as num2 and the value x XOR num1 is minimal. Note that XOR is the bitwise XOR operation. Return the integer x. The test cases are generated such that x…
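The statement above lends itself to a greedy approach, sketched here in Python. This is one possible solution under stated assumptions, not necessarily the one the full post uses: the function name `minimize_xor` and the two-pass structure are illustrative choices. The idea is to spend the available set bits on cancelling the highest set bits of num1 first, then place any leftover bits in the lowest free positions so they add as little as possible to the XOR.

```python
def minimize_xor(num1: int, num2: int) -> int:
    """Greedy sketch: build x with popcount(num2) set bits minimizing x ^ num1."""
    budget = bin(num2).count("1")  # number of set bits x must have
    x = 0

    # Pass 1: cancel the set bits of num1 from most to least significant,
    # since clearing a high bit of the XOR matters more than any lower bits.
    for bit in range(num1.bit_length() - 1, -1, -1):
        if budget == 0:
            break
        if (num1 >> bit) & 1:
            x |= 1 << bit
            budget -= 1

    # Pass 2: if bits remain, place them in the lowest positions not yet
    # set in x, which adds the smallest possible amount to the XOR.
    bit = 0
    while budget > 0:
        if not (x >> bit) & 1:
            x |= 1 << bit
            budget -= 1
        bit += 1

    return x
```

For instance, `minimize_xor(1, 12)` needs two set bits: the first cancels bit 0 of num1, and the second goes to the lowest free position (bit 1), giving 3.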


Leetcode 39 – Combination Sum

Source: https://leetcode.com/problems/combination-sum/

Problem statement: Given an array of distinct integers candidates and a target integer target, return a list of all unique combinations of candidates where the chosen numbers sum to target. You may return the combinations in any order. The same number may be chosen from candidates an unlimited number of times. Two combinations…
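A common way to approach this kind of problem is backtracking, sketched below in Python. This is an illustrative sketch rather than the post's own solution: the names `combination_sum` and `backtrack` are hypothetical, and the code assumes every candidate is a positive integer (which the excerpt above does not state) so that the search can stop once a candidate exceeds the remaining sum.

```python
from typing import List


def combination_sum(candidates: List[int], target: int) -> List[List[int]]:
    """Backtracking sketch; assumes all candidates are positive integers."""
    ordered = sorted(candidates)  # sorting enables early termination below
    results: List[List[int]] = []

    def backtrack(start: int, remaining: int, path: List[int]) -> None:
        if remaining == 0:
            results.append(list(path))  # copy: path is mutated afterwards
            return
        for i in range(start, len(ordered)):
            value = ordered[i]
            if value > remaining:
                break  # every later candidate is at least as large
            path.append(value)
            # Pass i (not i + 1) so the same number may be reused, while
            # never looking back at earlier indices avoids duplicates.
            backtrack(i, remaining - value, path)
            path.pop()

    backtrack(0, target, [])
    return results
```

Restricting each recursive call to indices at or after `start` is what keeps the combinations unique: each one is generated only in non-decreasing order.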


Leetcode 30 – Substring with Concatenation of All Words

Source: https://leetcode.com/problems/substring-with-concatenation-of-all-words/

Problem statement: You are given a string s and an array of strings words. All the strings of words are of the same length. A concatenated substring in s is a substring that contains all the strings of any permutation of words concatenated. For example, if words = ["ab","cd","ef"], then "abcdef", "abefcd", "cdabef",…
