The Tragic Downfall of the Internet’s Art Gallery
Plus, more links to make you a little bit smarter today.
I Have 50GB of Music Samples. Time To Use Them.
This Ancient Story Was Drowned In Lies. The Truth is Much More Interesting.
One of the most interesting stories from ancient history is the encounter between Alexander and Diogenes. Unfortunately, most retellings of it, even the ones told in classrooms, are misleading and largely inaccurate. So let me tell you the truth behind one of my favorite Greek tales.
Uniform Pessimistic Risk and its Optimal Portfolio
Optimal asset allocation has been widely studied through the theoretical analysis of risk measures, and pessimism is one of the most attractive approaches for moving beyond the conventional optimal portfolio model. The α-risk plays a crucial role in deriving a broad class of pessimistic optimal portfolios. However, estimating an optimal portfolio under a pessimistic risk measure remains challenging due to the absence of a computationally tractable model. In this study, we propose an integral of the α-risk, called the uniform pessimistic risk, together with a computational algorithm for obtaining the optimal portfolio based on this risk. We further investigate the theoretical properties of the proposed risk from three different perspectives: multiple quantile regression, proper scoring rules, and distributionally robust optimization. Real-data analysis of three stock datasets (S&P 500, CSI 500, KOSPI 200) demonstrates the usefulness of the proposed risk and portfolio model.
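As a rough intuition for the idea, here is a minimal sketch (not the paper's algorithm) that assumes the uniform pessimistic risk can be approximated by averaging the empirical α-risk (expected shortfall) over a grid of α levels, and that a long-only portfolio can be found by generic numerical optimization. The asset returns, grid, and optimizer choice are all illustrative assumptions.

```python
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(0)
# Hypothetical daily returns for 4 assets (placeholder data, not the paper's datasets).
returns = rng.normal(0.0005, 0.01, size=(1000, 4))

def alpha_risk(portfolio_returns, alpha):
    """Empirical expected shortfall at level alpha: negative mean of the worst alpha tail."""
    q = np.quantile(portfolio_returns, alpha)
    tail = portfolio_returns[portfolio_returns <= q]
    return -tail.mean()

def uniform_pessimistic_risk(weights, returns, alphas=np.linspace(0.05, 0.5, 10)):
    """Approximate the integral of alpha-risk by averaging over a grid of alpha levels."""
    pr = returns @ weights
    return np.mean([alpha_risk(pr, a) for a in alphas])

n = returns.shape[1]
constraints = ({'type': 'eq', 'fun': lambda w: w.sum() - 1.0},)  # fully invested
bounds = [(0.0, 1.0)] * n                                        # long-only
res = minimize(uniform_pessimistic_risk, x0=np.full(n, 1.0 / n),
               args=(returns,), bounds=bounds, constraints=constraints)
print("optimal weights:", np.round(res.x, 3))
```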
Red-Teaming for Generative AI: Silver Bullet or Security Theater?
In response to rising concerns surrounding the safety, security, and trustworthiness of Generative AI (GenAI) models, practitioners and regulators alike have pointed to AI red-teaming as a key component of their strategies for identifying and mitigating these risks. However, despite AI red-teaming's central role in policy discussions and corporate messaging, significant questions remain about what precisely it means, what role it can play in regulation, and how it relates to conventional red-teaming practices as originally conceived in the field of cybersecurity. In this work, we identify recent cases of red-teaming activities in the AI industry and conduct an extensive survey of the relevant research literature to characterize the scope, structure, and criteria of AI red-teaming practices. Our analysis reveals that prior methods and practices of AI red-teaming diverge along several axes, including the purpose of the activity (which is often vague), the artifact under evaluation, the setting in which the activity is conducted (e.g., actors, resources, and methods), and the decisions it informs (e.g., reporting, disclosure, and mitigation). In light of our findings, we argue that while red-teaming may be a valuable big-tent idea for characterizing GenAI harm mitigations, and industry may effectively apply red-teaming and other strategies behind closed doors to safeguard AI, gestures toward red-teaming (based on public definitions) as a panacea for every possible risk verge on security theater. To move toward a more robust toolbox of evaluations for generative AI, we synthesize our recommendations into a question bank meant to guide and scaffold future AI red-teaming practices.
The Tragic Downfall of the Internet’s Art Gallery
Once a vibrant platform for artists, DeviantArt is now buckling under the weight of bots and greed—and spurning the creative community that made it great.
Fine-Tuning Large Vision-Language Models as Decision-Making Agents via Reinforcement Learning
Large vision-language models (VLMs) fine-tuned on specialized visual instruction-following data have exhibited impressive language reasoning capabilities across various scenarios. However, this fine-tuning paradigm may not be able to efficiently learn optimal decision-making agents in multi-step, goal-directed tasks from interactive environments. To address this challenge, we propose an algorithmic framework that fine-tunes VLMs with reinforcement learning (RL). Specifically, our framework provides a task description and then prompts the VLM to generate chain-of-thought (CoT) reasoning, enabling the VLM to efficiently explore intermediate reasoning steps that lead to the final text-based action. Next, the open-ended text output is parsed into an executable action to interact with the environment and obtain goal-directed task rewards. Finally, our framework uses these task rewards to fine-tune the entire VLM with RL. Empirically, we demonstrate that our proposed framework enhances the decision-making capabilities of VLM agents across various tasks, enabling 7B models to outperform commercial models such as GPT-4V and Gemini. Furthermore, we find that CoT reasoning is a crucial component for performance improvement, as removing it results in a significant decrease in the overall performance of our method.
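To make the interaction loop in the abstract concrete, here is a schematic sketch with stub components: `ToyVLM` and `ToyEnv` are hypothetical stand-ins, not the paper's models or benchmarks. It only illustrates prompt → CoT generation → parsed action → task reward; the actual RL update of the VLM (e.g., a policy-gradient step over the generated tokens) is left as a placeholder comment.

```python
import re
import random

class ToyEnv:
    """Stub goal-directed environment with discrete, text-named actions."""
    ACTIONS = ["left", "right", "up", "down"]
    def reset(self):
        self.t = 0
        return "You are in a grid. Reach the goal."
    def step(self, action):
        self.t += 1
        reward = 1.0 if action == "right" else 0.0   # toy reward signal
        done = self.t >= 5
        return f"step {self.t}", reward, done

class ToyVLM:
    """Stub policy that emits chain-of-thought followed by an action line."""
    def generate(self, prompt):
        action = random.choice(ToyEnv.ACTIONS)
        return f"Thought: I should explore.\nAction: {action}"

def parse_action(text):
    """Parse the open-ended text output into an executable action (fallback: a default move)."""
    m = re.search(r"Action:\s*(\w+)", text)
    return m.group(1).lower() if m else "left"

env, vlm = ToyEnv(), ToyVLM()
obs, done, total = env.reset(), False, 0.0
while not done:
    prompt = f"Task: reach the goal.\nObservation: {obs}\nThink step by step, then act."
    output = vlm.generate(prompt)                 # CoT reasoning + text-based action
    obs, reward, done = env.step(parse_action(output))
    total += reward
    # here the collected task rewards would be used to fine-tune the whole VLM with RL
print("episode return:", total)
```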
Is the annualized compounded return of Medallion over 35%?
It is challenging to estimate fund performance using compounded returns. Arguably, it is incorrect to compound the reported yearly returns directly, which yields an annualized return of over 60% for Medallion over the 31 years up to 2018. We propose an estimation based on fund sizes and trading profits and obtain a compounded return of 32.6% before fees, assuming a 3% financing rate. Alternatively, we suggest using the manager's wealth as a proxy and arrive at a compounded growth rate of 25.6% for Simons over the 33 years up to 2020. We conclude that the annualized compounded return of Medallion before fees is probably under 35%. Our findings have implications for how to compute fund performance correctly.
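A toy calculation helps show why direct compounding overstates growth when a fund is capped in size and distributes its profits, which is the kind of size-based argument the abstract makes. All numbers below are hypothetical illustrations, not the paper's data or its actual method.

```python
# Hypothetical setup: a fund capped at $10B earns 60% per year, but profits are
# paid out rather than reinvested, so total wealth grows linearly, not geometrically.
fund_size, yearly_return, years = 10.0, 0.60, 31

distributed = 0.0
for _ in range(years):
    profit = fund_size * yearly_return
    distributed += profit              # profits distributed; fund size stays capped

naive_cagr = yearly_return                               # what compounding yearly returns implies
end_wealth = fund_size + distributed                     # simplified wealth path (payouts held as cash)
wealth_cagr = (end_wealth / fund_size) ** (1 / years) - 1

print(f"naive compounding implies:     {naive_cagr:.1%} per year")
print(f"wealth-based compounded rate:  {wealth_cagr:.1%} per year")
```

The gap between the two rates is the point of the comparison; the paper's own estimate additionally accounts for fund sizes, trading profits, and a financing rate rather than assuming payouts sit in cash.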