Saturday, June 28, 2025
Coins League

OpenAI GPT-4o ranked as best AI model for writing Solidity smart contract code by IQ

October 22, 2024
in Web3


SolidityBench by IQ has launched as the first leaderboard to evaluate LLMs in Solidity code generation. Available on Hugging Face, it introduces two innovative benchmarks, NaïveJudge and HumanEval for Solidity, designed to assess and rank the proficiency of AI models in generating smart contract code.

Developed by IQ’s BrainDAO as part of its forthcoming IQ Code suite, SolidityBench serves to refine their own EVMind LLMs and compare them against generalist and community-created models. IQ Code aims to offer AI models tailored for generating and auditing smart contract code, addressing the growing need for secure and efficient blockchain applications.

As IQ told CryptoSlate, NaïveJudge offers a novel approach by tasking LLMs with implementing smart contracts based on detailed specifications derived from audited OpenZeppelin contracts. These contracts provide a gold standard for correctness and efficiency. The generated code is evaluated against a reference implementation using criteria such as functional completeness, adherence to Solidity best practices and security standards, and optimization efficiency.

The evaluation process leverages advanced LLMs, including different versions of OpenAI’s GPT-4 and Claude 3.5 Sonnet, as impartial code reviewers. They assess the code based on rigorous criteria, including implementation of all key functionalities, handling of edge cases, error management, proper syntax usage, and overall code structure and maintainability.

Optimization considerations such as gas efficiency and storage management are also evaluated. Scores range from 0 to 100, providing a comprehensive assessment across functionality, security, and efficiency, mirroring the complexities of professional smart contract development.
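The review step described above can be pictured with a small sketch. It is only an illustration under stated assumptions: the prompt wording, the `SCORE:` reply format, the function names, and the averaging across reviewer models are all hypothetical, and IQ has not published them in this article. No model API is called; the code only builds a prompt and parses replies.

```python
import re
from statistics import mean

# Hypothetical rubric: the criteria names follow the article, but the
# wording and the absence of weights are assumptions, not IQ's actual prompt.
CRITERIA = [
    "functional completeness",
    "adherence to Solidity best practices and security standards",
    "optimization efficiency (gas and storage)",
]


def build_judge_prompt(spec: str, reference: str, candidate: str) -> str:
    """Assemble a review prompt comparing candidate code to the reference."""
    rubric = "\n".join(f"- {c}" for c in CRITERIA)
    return (
        "You are an impartial Solidity code reviewer.\n"
        f"Specification:\n{spec}\n\n"
        f"Reference implementation:\n{reference}\n\n"
        f"Candidate implementation:\n{candidate}\n\n"
        f"Judge the candidate on:\n{rubric}\n"
        "Reply with a final line of the form 'SCORE: <0-100>'."
    )


def parse_score(reply: str) -> float:
    """Extract the score from a reviewer reply and clamp it to [0, 100]."""
    match = re.search(r"SCORE:\s*(\d+(?:\.\d+)?)", reply)
    if match is None:
        raise ValueError("reviewer reply contained no SCORE line")
    return max(0.0, min(100.0, float(match.group(1))))


def combine_reviews(replies: list[str]) -> float:
    """Average the scores from several reviewer models (an assumed rule)."""
    return mean(parse_score(r) for r in replies)
```

Clamping in `parse_score` keeps a malformed reviewer reply from pushing a result outside the 0 to 100 range the article describes.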

Which AI models are best for Solidity smart contract development?

Benchmarking results showed that OpenAI’s GPT-4o model achieved the highest overall score of 80.05, with a NaïveJudge score of 72.18 and HumanEval for Solidity pass rates of 80% at pass@1 and 92% at pass@3.

Interestingly, newer reasoning models like OpenAI’s o1-preview and o1-mini were beaten to the top spot, scoring 77.61 and 75.08, respectively. Models from Anthropic and xAI, including Claude 3.5 Sonnet and grok-2, demonstrated competitive performance with overall scores hovering around 74. Nvidia’s Llama-3.1-Nemotron-70B scored lowest in the top 10 at 52.54.

SolidityBench scores for LLMs (Hugging Face)

Per IQ, HumanEval for Solidity adapts OpenAI’s original HumanEval benchmark from Python to Solidity, encompassing 25 tasks of varying difficulty. Each task includes corresponding tests compatible with Hardhat, a popular Ethereum development environment, facilitating accurate compilation and testing of generated code. The evaluation metrics, pass@1 and pass@3, measure the model’s success on initial attempts and over multiple tries, offering insights into both precision and problem-solving capabilities.
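For readers unfamiliar with pass@k, the sketch below shows the estimator commonly used with the original HumanEval benchmark: with n generated samples per task, of which c pass the tests, pass@k is estimated as 1 - C(n-c, k) / C(n, k). Whether SolidityBench computes its figures this way is an assumption, since the article does not say; the function names and the per-task result format are hypothetical.

```python
from math import comb


def pass_at_k(n: int, c: int, k: int) -> float:
    """Estimate the chance that at least one of k samples passes.

    n: samples generated for the task, c: samples that passed, k: budget.
    """
    if not 0 <= c <= n or not 1 <= k <= n:
        raise ValueError("require 0 <= c <= n and 1 <= k <= n")
    # comb(n - c, k) is 0 when fewer than k samples failed, giving 1.0.
    return 1.0 - comb(n - c, k) / comb(n, k)


def benchmark_pass_at_k(results: dict[str, tuple[int, int]], k: int) -> float:
    """Mean pass@k over tasks; results maps task name -> (n, c)."""
    return sum(pass_at_k(n, c, k) for n, c in results.values()) / len(results)
```

With three samples per task, a task that passes on exactly one sample contributes one third to pass@1 but a full point to pass@3, which is why pass@3 is never lower than pass@1 in the reported results.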

Goals of utilizing AI models in smart contract development

By introducing these benchmarks, SolidityBench seeks to advance AI-assisted smart contract development. It encourages the creation of more sophisticated and reliable AI models while providing developers and researchers with valuable insights into AI’s current capabilities and limitations in Solidity development.

The benchmarking toolkit aims to advance IQ Code’s EVMind LLMs and also sets new standards for AI-assisted smart contract development across the blockchain ecosystem. The initiative hopes to address a critical need in the industry, where the demand for secure and efficient smart contracts continues to grow.

Developers, researchers, and AI enthusiasts are invited to explore and contribute to SolidityBench, which aims to drive the continuous refinement of AI models, promote best practices, and advance decentralized applications.

Visit the SolidityBench leaderboard on Hugging Face to learn more and begin benchmarking Solidity generation models.



Copyright © 2023 Coins League.
Coins League is not responsible for the content of external sites.
