
Rust 1.88 adds support for naked functions 26 Jun 2025, 8:32 pm

Rust 1.88 has been released, featuring support for writing naked functions as well as Boolean literals in cfg conditional compilation.

The update of the popular fast and safe programming language was announced June 26 by the Rust Release Team. Current Rust users can get Rust 1.88 by running rustup update stable.

Rust 1.88 adds language support for writing naked functions, which have no compiler-generated prologue or epilogue, giving developers full control over the generated assembly for a particular function. The Rust team said this is a more ergonomic alternative to defining functions in a global_asm! block. A naked function is marked with the #[unsafe(naked)] attribute, and its body consists of a single naked_asm! call. The team showed an example in which a handwritten assembly block defines the entire function body. Unlike for non-naked functions, the compiler does not add any special handling for arguments or return values of naked functions. Naked functions are used in low-level settings such as Rust’s compiler-builtins, operating systems, and embedded applications.
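The release announcement includes an example along these lines; the following minimal sketch assumes an x86-64 target and the System V calling convention, with an illustrative function name and assembly:

#[unsafe(naked)]
pub unsafe extern "sysv64" fn wrapping_add(a: u64, b: u64) -> u64 {
    // The body is a single naked_asm! call; the compiler emits no prologue
    // or epilogue, so this handwritten assembly is the entire function.
    // Equivalent to a.wrapping_add(b): rdi holds a, rsi holds b.
    core::arch::naked_asm!(
        "lea rax, [rdi + rsi]",
        "ret",
    );
}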

With Rust 1.88, the cfg predicate language now supports the Boolean literals true and false, which act as a configuration that is always enabled or always disabled. This works in Rust conditional compilation with the cfg and cfg_attr attributes, in the built-in cfg! macro, and in Cargo [target] tables in both configuration and manifests, the Rust team said. Previously, empty predicate lists could be used for unconditional configuration, such as cfg(all()) for enabled and cfg(any()) for disabled, but cfg(true) and cfg(false) offer a more direct way to say what is meant, the team said.
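A minimal sketch (not taken from the release notes) showing both literals in use:

// Always compiled, a more direct spelling of the older #[cfg(all())].
#[cfg(true)]
fn always_built() {
    println!("this item is always compiled in");
}

// Never compiled, a more direct spelling of the older #[cfg(any())].
#[cfg(false)]
fn never_built() {}

fn main() {
    always_built();
    // The built-in cfg! macro accepts the literals as well.
    if cfg!(false) {
        unreachable!();
    }
}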

Also with Rust 1.88, let statements can now be chained with && inside if and while conditions, and even intermingled with Boolean expressions, blurring the distinction between if and if let and between while and while let. The patterns inside the let sub-expressions can be refutable or irrefutable, and bindings are usable in later parts of the chain as well as in the body, according to the Rust team.
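A small illustrative sketch, assuming the 2024 edition (where let chains are stable):

fn sum_if_small(a: Option<u32>, b: Option<u32>) -> Option<u32> {
    // Two let patterns and a plain Boolean test chained with &&; the
    // bindings x and y are usable later in the chain and in the body.
    if let Some(x) = a
        && let Some(y) = b
        && x + y < 100
    {
        Some(x + y)
    } else {
        None
    }
}

fn main() {
    assert_eq!(sum_if_small(Some(2), Some(40)), Some(42));
    assert_eq!(sum_if_small(Some(90), Some(40)), None);
    assert_eq!(sum_if_small(None, Some(7)), None);
}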

Finally, the Cargo package manager now automatically runs garbage collection on the cache in its home directory. In explaining this change, the Rust team said that when building, Cargo downloads and caches crates needed as dependencies. Historically, these downloaded files were never cleaned up, leading to an unbounded amount of disk usage in Cargo’s home directory. With Rust 1.88, Cargo introduces a garbage collection mechanism to automatically clean up old files.

Rust 1.88 follows last month’s Rust 1.87 release, which brought support for anonymous pipes and architecture intrinsics.


Teradata aims to simplify on-premises AI for data scientists with AI Factory 26 Jun 2025, 11:10 am

Running AI workloads on premises has often meant overloading shared infrastructure or relying on complex hybrid setups. In an effort to reduce this friction, Teradata has bundled some of its existing tools into a new AI Factory to help data scientists manage the full AI lifecycle on premises without disrupting core systems or moving to the cloud.

“A lot of IT departments are already fully using their Teradata system for mission-critical workstreams and really don’t want data scientists doing AI exploratory work on that same system,” said Dan Spurling, SVP of product management at Teradata.

Until now, he said, to keep the two apart, enterprises would have to use a hybrid solution of Teradata’s existing on-prem hardware and VantageCloud. “But AI Factory simplifies this process,” he said, separating the infrastructures where needed and making it possible to do all of the AI workstream, or at least the most expensive part of it, on premises.

This can give organizations “predictable costs and more control over their data sovereignty,” Spurling said.

AI Factory combines the data analytics software provider’s database engine and AI Workbench (consisting of ClearScape Analytics, ModelOps, JupyterHub, AI and ML In-Engine Analytics, Enterprise Vector Store, and support for the Open Table format) with AI microservices such as RAG, retrieval embedding, reranking, and guardrails. This combination wasn’t previously available on premises as a single, integrated, and ready-to-run stack.

Together, these tools give enterprises a way to scale AI with an out-of-the-box packaged solution — with governance, performance, and control baked in, said Michael Ni, principal analyst at Constellation Research.

“What stands out is how Teradata integrated ModelOps, vector search, and LLM pipelines into a turnkey platform providing traceability and compliance from experimentation to production,” Ni added.

AI Factory is also developer-friendly, said Moor Insights and Strategy principal analyst Robert Kramer.

“The built-in AI Workbench includes tools such as JupyterHub and ModelOps, which are familiar to data teams and help keep the development process moving. That’s important for customers who want to keep AI projects from getting stuck in setup or compliance bottlenecks,” Kramer said. The predictable pricing model should help customers avoid surprise costs, he said.

Regulatory pressure

Enterprises’ desire to move AI activities on-premises isn’t just driven by questions of cost or control: For some businesses, it’s also about regulatory norms and scrutiny.

“Between stricter data privacy rules, growing concerns over data sovereignty, and unpredictable cloud costs, many enterprises are rethinking what should stay in the cloud and what’s better off staying in-house,” Kramer said. “This is especially true for industries like healthcare, finance, and government, where regulations are tight and data sensitivity is high.”

Other vendors including IBM, Dell, and HPE are looking to help enterprises build AI on-premises, but Kramer said Teradata has an edge, especially with customers who are already using it for data management and want to bring AI into the mix without standing up a bunch of new systems or re-architecting what they already have.

“While others might have gotten there first, Teradata’s offering is built around real integration with enterprise data operations, which can save time and reduce complexity for its existing customers,” he said.

Integration with Nvidia

For AI Factory, Teradata has partnered with Nvidia, using the Nvidia Enterprise AI Factory Validated Design, a full-stack design featuring Nvidia’s Blackwell-accelerated computing systems, AI Enterprise software, and networking hardware.


In this case, though, the Nvidia GPUs have to be owned by the enterprise deploying AI Factory on premises, Teradata said.

As Constellation Research’s Ni put it, “Nvidia brings the AI muscle, Teradata brings enterprise-grade data and analytics platform.”

Having their Nvidia chips on premises, Kramer said, enables enterprises to run heavier AI workloads such as training or inferencing models without relying on cloud-based GPUs, which can lead to high or unpredictable costs. “To avoid that, what Teradata does is connects to those GPUs using its AI Microservices layer, which helps customers get more out of what they’ve already invested in,” he said.

AI Factory is now generally available.


Designing a metadata-driven ETL framework with Azure ADF: An architectural perspective 26 Jun 2025, 7:12 am

In today’s data-driven landscape, integrating diverse data sources into a cohesive system is a complex challenge. As an architect, I set out to design a solution that could seamlessly connect on-premises databases, cloud applications and file systems to a centralized data warehouse. Traditional ETL (extract, transform, load) processes often felt rigid and inefficient, struggling to keep pace with the rapid evolution of data ecosystems. My vision was to create an architecture that not only scaled effortlessly but also adapted dynamically to new requirements without constant manual rework. 

The result of this vision is a metadata-driven ETL framework built on Azure Data Factory (ADF). By leveraging metadata to define and drive ETL processes, the system offers unparalleled flexibility and efficiency. In this article, I’ll share the thought process behind this design, the key architectural decisions I made and how I addressed the challenges that arose during its development. 

Recognizing the need for a new approach 

The proliferation of data sources — ranging from relational databases like SQL Server and Oracle to SaaS platforms like Salesforce and file-based systems like SFTP — exposed the limitations of conventional ETL strategies. Each new source typically required a custom-built pipeline, which quickly became a maintenance burden. Adjusting these pipelines to accommodate shifting requirements was time-consuming and resource-intensive. I realized that a more agile and sustainable approach was essential. 

A metadata-driven design emerged as the ideal solution. By centralizing ETL configurations in a metadata repository, I could abstract the logic of the processes away from the pipelines themselves. With this abstraction, adding a new data source or modifying an existing flow could be achieved by updating metadata rather than rewriting code. The promise of reduced maintenance and faster adaptability drove me to pursue this direction. 

Crafting the architectural framework 

Laying the groundwork with a metadata schema 

The cornerstone of this architecture is a carefully crafted metadata schema, hosted in an Azure SQL Database. This schema captures all the necessary details to orchestrate the ETL process: connection information, copy operation definitions and configuration settings. I chose Azure SQL Database for its robust relational capabilities, which enable efficient querying and management of intricate metadata relationships. Its tight integration with Azure services, such as ADF, ensures a smooth data flow, while its scalability and high availability meet the demands of a reliable system. 

The schema includes several key tables: 

  • Connections: Stores source and destination details, with sensitive data linked to secure references rather than stored directly. 
  • CopyOperations: Specifies the types of data transfers, such as database-to-database or file-to-storage. 
  • ETLConfig: Defines the specifics of each ETL process, including objects to transfer, execution order and any pre- or post-processing steps. 

This metadata serves as the blueprint, dynamically guiding the entire ETL workflow. 

Orchestrating with a parent pipeline 

Central to the framework is a parent pipeline in ADF, which I designed as the primary orchestrator. This pipeline is parameterized to accept an identifier for a specific ETL flow. When triggered, it calls a stored procedure in the Azure SQL Database, which returns a JSON response detailing the processes to execute. For example, the JSON might outline a sequence of transfers from a database to a data lake, each with its own parameters. 

The parent pipeline then delegates these tasks to specialized child pipelines, passing along the necessary metadata. This design eliminates the need for hardcoding specific flows into the pipeline itself, making it highly adaptable. The decision to use a stored procedure for generating the JSON configuration was intentional — it allows complex logic to reside in the database, keeping the pipeline lean and focused on execution. 

Chart of an ADF parent pipeline (image credit: Vikram Garg)

Building modularity with child pipelines 

To manage the variety of data sources and destinations, I opted for a modular structure. Each type of data transfer — whether from a database to another database or from a file system to cloud storage — is handled by a dedicated child pipeline. These child pipelines are parameterized templates, meaning they can be reused across different scenarios by simply adjusting the runtime inputs. 

For instance, a child pipeline designed for database-to-database transfers can handle any such operation by receiving the appropriate source and destination details from the metadata. This modularity simplifies expansion: supporting a new transfer type involves creating a new child pipeline template and updating the metadata, rather than redesigning the entire system. 

Prioritizing security 

Security is a non-negotiable aspect of data integration. To safeguard sensitive information like connection credentials, I integrated Azure Key Vault into the framework. The metadata schema references only the keys to these secrets, which are retrieved by the pipelines at runtime. This ensures that sensitive data remains encrypted and inaccessible outside of authorized processes. 

This design not only bolsters security but also streamlines secret management across different environments, such as development and production. It was a deliberate choice to embed security into the architecture from the ground up, ensuring compliance and trust in the system. 

Enabling flexible execution 

The framework needed to support both scheduled and on-demand ETL processes. To achieve this, I incorporated multiple triggering mechanisms. For scheduled runs, I used Azure Logic Apps, which read timing configurations from the metadata and initiate the parent pipeline accordingly. For on-demand execution, an API endpoint exposed via Azure API Management allows external systems or users to trigger the process as needed. 

I also added event-based triggers, such as monitoring a file drop location for new arrivals and launching the ETL process automatically. This versatility ensures the framework can meet a wide range of operational demands. 

Ensuring reliability with monitoring and logging 

A robust ETL system requires visibility and reliability. I built comprehensive logging into the framework, capturing execution details (start times, end times, statuses, and errors) in the Azure SQL Database. For real-time oversight, I integrated Azure Application Insights to monitor pipeline performance. 

To handle failures, I designed a utility pipeline that logs exceptions, sends alerts for critical issues and flags problems for investigation. This proactive monitoring minimizes disruptions and keeps the system running smoothly. 

Empowering users with a simple interface 

To broaden the framework’s usability, I developed a web-based interface hosted on Azure App Service. This UI allows non-technical users to manage metadata — adding new sources, adjusting configurations or setting schedules — without needing to navigate ADF directly. This self-service capability reduces IT dependency and accelerates the integration process. 

Tackling design challenges 

The design process presented several hurdles. One major challenge was ensuring the framework could handle diverse data sources without becoming overly complex. The modular child pipelines addressed this by isolating each operation type, maintaining simplicity while supporting variety. 

Performance was another concern. With large data volumes, I needed to minimize latency and maximize throughput. I employed parallel processing where feasible and optimized data partitioning to enhance efficiency. ADF’s native data movement features further streamlined transfers. 

Scalability posed its own issues. As the number of copy operations grew, I encountered limitations in ADF’s switch case activity, which is capped at 25 cases. To overcome this, I added a categorization layer in the parent pipeline, grouping operations by type and effectively expanding capacity. This solution required careful validation but proved effective. 

Another constraint was ADF’s static integration runtime selection for on-premises sources. Since runtime couldn’t be dynamically chosen, I configured multiple linked services tied to different runtimes, selecting the appropriate one via metadata. It’s a workaround that ensures compatibility with hybrid environments. 

Reflecting on the design journey 

This project underscored several critical insights. Scalability must be baked into the architecture from the start — using metadata as the driver ensures the system can evolve without constant refactoring. Modularity simplifies maintenance and extension, proving its worth as new needs arise. Security, integrated early, builds a foundation of trust. And empowering users with intuitive tools fosters collaboration and efficiency. 

This metadata-driven ETL framework represents a shift in how I approach data integration. Its flexibility, scalability and user-centric design make it a powerful tool for managing complex data ecosystems. For fellow architects, I’d recommend exploring metadata-driven solutions — they require upfront effort but deliver lasting value. My experience with Azure ADF has been a journey of innovation, and I look forward to seeing how this approach evolves. 

This article is published as part of the Foundry Expert Contributor Network. 


Eclipse Foundation releases Jakarta EE 11 26 Jun 2025, 6:00 am

Moving forward with the development of its enterprise Java application platform, the Eclipse Foundation has released Jakarta EE 11 Platform, which promises simpler data access and streamlined testing processes. Java 21 support also is featured.

Rolled out June 26, Jakarta Enterprise Edition 11 homes in on performance and productivity, according to Eclipse. Featured as part of Jakarta EE 11 is Jakarta Data, a new specification designed to simplify data access and improve developer productivity. “Jakarta Data is a new specification that is standardizing the repository pattern for data access,” said Tanja Obradovic, Eclipse Jakarta EE program manager. “It reduces the complexity of the persistence logic that we’re using.”

Core functionality described in the Jakarta Data specification includes BasicRepository, which is a built-in repository supertype for performing basic operations on entities, and CrudRepository, for basic CRUD (create, read, update, and delete) operations, to make database interactions more straightforward and less error-prone. Other parts of the specification include support for both offset and cursor-based pagination and a streamlined query language for Jakarta Data repositories.

The Jakarta EE 11 release builds on previous Core Profile (December 2024) and Web Profile (March 2025) versions and represents a significant advancement in simplifying enterprise Java for cloud-native development, Eclipse said. Microsoft and Red Hat were among the companies that participated in the development of the release. “Our collaboration with esteemed partners IBM, Red Hat, and Oracle has been instrumental in supporting Jakarta EE 11 runtimes on Azure, including Azure Kubernetes Service, Azure Red Hat OpenShift, and App Service,” Microsoft’s Scott Hunter, vice president of product, Azure developer experience, said in a statement.

Jakarta EE 11 has been verified to work with Java 17 and Java 21, both of which are long-term support (LTS) releases of Java SE (Standard Edition). Concurrency enhancements are supported for Java 21 including support for virtual threads, which improves scalability, reduces overhead, and offers significant performance gains, said Eclipse. Future plans call for supporting the upcoming JDK 25 LTS release, due to arrive in September, in Jakarta EE 12, which is targeted for release in 2026.

All Jakarta EE releases are based on the Jakarta EE Platform specification. Streamlined Jakarta EE 11 specifications include a modernized Technology Compatibility Kit (TCK) designed to improve compatibility testing and reduce barriers to adding new tests as the platform evolves, Eclipse said. Managed Beans has been deprecated for removal to achieve a simpler and more modern programming model, while Contexts and Dependency Injection (CDI) enhancements promise more consistent application behavior. Other streamlined specifications cover Java records support, for broader integration for data integrity and reduced boilerplate code, and the removal of the Java SE Security Manager, to enable more modern security practices. Tool upgrades in Jakarta EE 11 move the platform from Apache Ant and the Java Test Harness to JUnit 5 and Apache Maven for enhanced efficiency and relevance, Eclipse said. Updating the TCK to a multi-dependency Maven project boosts compatibility testing and removes barriers to adding new tests.

Jakarta EE Working Group members have certified several products as compatible with Jakarta EE 11. For the Web Profile, the Eclipse GlassFish application server has been certified. For the Core Profile, IBM’s Open Liberty microservices framework, Red Hat’s WildFly application server, the Payara Platform Community Edition, and the Fujitsu Software Enterprise Application Platform have been certified, according to Eclipse. This list is expected to grow rapidly after Jakarta EE 11’s release.

Enterprise Java, previously known as Java EE and under Oracle’s jurisdiction, was handed over to the Eclipse Foundation in 2017. Oracle sought to turn the project over to an open source organization while remaining the steward of standard Java. The predecessor to Jakarta EE 11, Jakarta EE 10, arrived in September 2022.


Bringing post-quantum cryptography to Windows 26 Jun 2025, 5:00 am

Much of what we do to keep our online lives secure relies on public-key cryptography and its complex mathematical operations. At the heart of these techniques are sets of one-way functions that generate the public and private keys used to encrypt and decrypt data.

Those mathematical functions are secure because it would take immense amounts of time and computational power to find a private key from a public key, factor very large numbers, and then decrypt data—at least, if you’re using a conventional computer. Some algorithms can be cracked using specialized hardware, but even here cost is still an issue.

Quantum computing and modern cryptography

One technology on the horizon could make the cryptographic basis of our entire online world obsolete almost overnight. Quantum computing uses low-temperature physics to build qubits, structures that can hold a superposition of possible states, and then constructs quantum circuits that embody complex algorithms and quickly collapse probabilities to answer problems that would take many thousands of years with conventional computers.

Quantum computing factorization tools such as Shor’s algorithm require millions of qubits to factor a single public key, and today’s quantum computers offer a mere handful of qubits. The technology that underpins quantum computing is advancing rapidly, with Microsoft and other companies developing new materials and error-correction techniques to deliver stable qubits at scale and at an economically feasible cost.

That doesn’t mean the entire world of computing will be upended overnight. The first at-scale quantum computers are still years away and are likely to initially be used for pure science. As they get easier and cheaper to build, they will be used by governments and by criminals looking to decrypt decades of financial data and other secrets.

Into the post-quantum world

For now we’re safe. We have time to protect our secrets with new encryption algorithms designed to resist quantum computing-based factorization. These post-quantum encryption algorithms are intended to replace the commonly used asymmetric algorithms that form the basis of much of today’s public-key infrastructures.

The intent is to use new mathematical approaches that are hard for both conventional and quantum computers to solve. Of course, there are downsides: The keys are larger and need more processing time, compute capacity, and memory. For now, post-quantum cryptography is reserved for valuable information where there’s economic incentive for bad actors to use quantum computing to decrypt your data.

Part of the transition to post-quantum cryptography is the standardization of new algorithms and making them available in common cryptographic libraries, especially those used by both OS and applications. Microsoft has been working with the National Institute of Standards and Technology (NIST) to standardize these new algorithms and has begun adding them to its base SymCrypt library.

Adding post-quantum cryptography to Windows

Used across Microsoft’s platforms, SymCrypt is a key component of tools such as Windows’ Cryptographic Primitives Library and also offers support on Linux for use in Azure. It now supports the ML-KEM, ML-DSA, and SLH-DSA post-quantum cryptographic algorithms. The field is still evolving, and although you can use these algorithms now, better ones may come along in the future, so be ready to change if necessary.

ML-based algorithms use a Module Lattice (ML) approach, while SLH is a Stateless Hash. ML-KEM was originally known as Kyber and uses a mix of mathematical techniques to increase the complexity of the process used to generate a key pair. Module lattice techniques are based on what are called “lattice problems,” which are hard to solve using computers. In fact, the hardest versions are so complex that even quantum computers will be challenged. It gets even more difficult when combined with an approach called “learning with errors” that adds noise to the process. This combination is why NIST has chosen ML-based algorithms for the FIPS-203 and 204 standards.

Preparing for the future, today

These algorithms are now available for Windows developers using Windows Insider builds as part of its Cryptography API Next-Generation libraries. This first release gives you access to ML-KEM for key encapsulation and ML-DSA for digital signatures. Using these now starts to protect you from what’s known as “harvest now, decrypt later” attacks.

In these attacks, adversaries keep samples of encrypted data (especially key exchanges) to decrypt once quantum computers become usable, at which point historic data that was once secret can be easily recovered, exposing trails of financial transactions or government messages that could still contain relevant information. Microsoft suggests you mix these new algorithms with existing ones to give you deeper defenses.

You can use a less computationally intensive version of ML-KEM for now while you prepare for a complete shift to newer cryptographic systems and any necessary supporting hardware. It’s likely that post-quantum cryptography will require a new generation of processor instructions or even dedicated accelerators to get the performance users and applications require.

Microsoft is also adding support for post-quantum cryptography to wincrypt, which provides the APIs behind the Windows certificate handling tools. You will be able to use ML-DSA certificates, managing them in the Windows certificate store and checking validity and trust.

Building post-quantum cryptography apps

At the heart of Microsoft’s Windows implementation of post-quantum cryptography is what it calls “Cryptography API: Next Generation” (CNG). CNG is intended to replace the current Windows cryptography APIs, so it makes sense as the home for next-generation cryptosystems like ML-KEM and ML-DSA. It’s a low-level library for use with C or C++. It’s been in development for some time now and is mature enough to use as it offers a flexible set of features to support most common use cases.

Microsoft’s CNG documentation recently added sample C++ code for working with both ML-DSA and ML-KEM. You can use familiar APIs such as the BCrypt family of functions to load the post-quantum encryption algorithm you want to use from Microsoft’s own implementation (though, as always, you have the option of using a third-party provider).

Generating a key pair uses the same steps as traditional encryption, generating pairs and setting their properties. For example, with ML-DSA, this sets the parameter set that’s being used. Choosing the right one is important, as this affects both the strength of the encryption method and its performance. As always this is a trade-off: The stronger the encryption, the longer it will take to create the key pair or a hash.

The process of generating a key or a hash with a post-quantum algorithm will be much the same as working with any other cryptographic algorithm today. Along with snippets of sample code, Microsoft provides complete modules you can use as the basis of any code you write.

Microsoft’s Linux post-quantum tools

Microsoft isn’t only delivering post-quantum cryptography in Windows, it’s also using SymCrypt as a cryptography provider for OpenSSL on Linux. This is intended to provide FIPS certification, something that it needs for its Azure government cloud services. This is being used to test post-quantum-based Transport Layer Security (TLS) operations using hybrid key exchange.

This is only a first step to robust post-quantum cryptography across the Microsoft platform, as the necessary standards themselves are still in their infancy. More algorithms will be added, with support for Windows TLS as part of its TLS 1.3 implementation. It’s also likely to be used sooner rather than later in Active Directory as part of its certificate services, generating ML-DSA-based certificates.

Microsoft is working on what it calls “crypto agility,” the ability to swap out new algorithms as they develop, and is using hybrid techniques that mix current techniques with post-quantum cryptography to balance both resources and protection while support and algorithms mature.

Post-quantum cryptography isn’t essential yet, but neither can you ignore it. It’s a good idea to try out these new features and see how the new algorithms affect your applications. If certificates and signatures take longer to use and require more resources, it’s important to understand how these latencies will impact your applications and whether you need to consider investing in new hardware now rather than waiting until the last minute.


How to use route constraints in ASP.NET Core minimal APIs 26 Jun 2025, 5:00 am

When working with minimal APIs in ASP.NET Core, you can define routes for the HTTP verbs using mapping methods such as MapGet, MapPut, MapPost, and MapDelete. While these methods allow you to route each request to the correct destination, you should also have a way to validate route parameters ahead of executing the action method. To do this, you can take advantage of route constraints to define patterns based on the API behavior you need.

In an earlier post here, we discussed how to work with route constraints when building controller-based APIs in ASP.NET Core Web API. In this post we will examine how we can work with route constraints when building minimal APIs. To use the code examples provided in this article, you should have Visual Studio 2022 installed in your system. If you don’t already have a copy, you can download Visual Studio 2022 here.

What is a route constraint? Why is it needed?

A route constraint in ASP.NET Core is a guardrail for the URL. It is a mechanism that enables you to determine, in plain terms, whether an incoming request matches predefined criteria and should be processed. Route constraints in ASP.NET Core are used to filter out unwanted requests or prevent invalid data from reaching your controller actions.

You can take advantage of the set of classes available as part of the Microsoft.AspNetCore.Routing.Constraints namespace to define route constraints. You typically combine multiple constraints when defining your application’s routing configuration. For example, you may want to define a route constraint that allows only positive integers in the route parameter. You may also want to enforce a specific length for a route parameter.

By ensuring that only requests that match your specific criteria are actually processed by your action methods, route constraints improve the security and performance of your application. Plus, route constraints provide a cleaner and simpler way to filter requests than having to write boilerplate and complex code. In other words, they also help to make your application easier to maintain.

Create an ASP.NET Core Web API project in Visual Studio 2022

To create an ASP.NET Core Web API project in Visual Studio 2022, follow the steps outlined below.

  1. Launch the Visual Studio 2022 IDE.
  2. Click on “Create new project.”
  3. In the “Create new project” window, select “ASP.NET Core Web API” from the list of templates displayed.
  4. Click Next.
  5. In the “Configure your new project” window, specify the name and location for the new project. Optionally check the “Place solution and project in the same directory” check box, depending on your preferences.
  6. Click Next.
  7. In the “Additional Information” window shown next, select “.NET 9.0 (Standard Term Support)” as the framework version and ensure that the “Use controllers” box is unchecked since we’ll be using Minimal APIs in this project.
  8. Elsewhere in the “Additional Information” window, leave the “Authentication Type” set to “None” (the default) and make sure the check boxes “Enable Open API Support,” “Configure for HTTPS,” and “Enable container support” remain unchecked. We won’t be using any of those features here.
  9. Click Create.

We’ll use this ASP.NET Core Web API project to work with route constraints in minimal APIs in the sections below.

Applying route constraints in minimal APIs

Route constraints can be applied in either of two ways:

  1. Using an inline constraint by specifying a constraint parameter in the MapControllerRoute method.
  2. Using a route attribute in the controller or any of its action methods.

Use an inline constraint to apply route constraints

The following code snippet shows how you can apply a route constraint in the MapControllerRoute method.


endpoints.MapControllerRoute(
name: "default",
pattern: "{controller=Home}/{action=Index}/{id:int?}");

Use route parameters to apply route constraints

You can leverage route parameter constraints to specify rules for your route parameters using the following syntax.


{parameter:constraint}

The following code snippet illustrates how you can enforce that a parameter must be an integer or match a specific pattern.


app.MapGet("/orders/{id:int}", (int id) => $"Order Id: {id}");

Using route constraints for low-level validations

You can apply route constraints to perform low-level validations in minimal APIs. We’ll examine how this can be achieved in this section.

Applying a route constraint in the MapController method

Let us first use inline constraints to perform the validation. The following code snippet shows how you can use the MapControllerRoute method to apply a route constraint on your endpoint. In this case, the constraint requires that the author Id be an integer.


app.MapControllerRoute(
    name: "onlyInteger",
    pattern: "author/{id:int}",
    defaults: new { controller = "Author", action = "Author Details" });

Similarly, you can specify that the firstname parameter for the /author endpoint must have a length greater than or equal to three, meaning that the firstname of an author must contain a minimum of three characters. The following code snippet illustrates how you can specify this in your endpoint as a route constraint.


app.MapControllerRoute(
    name: "nameLength",
    pattern: "author/{firstname:minlength(3)}",
    defaults: new { controller = "Author", action = "Get Author"});

Applying a route constraint in the MapGet, MapPut, and MapPost methods

Next we’ll examine how we can perform the validation using route parameter constraints. Consider the following C# class named Author.


public class Author
{
    public int AuthorId { get; set; }
    public string FirstName { get; set; } = string.Empty;
    public string LastName { get; set; } = string.Empty;
    public string Address { get; set; } = string.Empty;
    public string Email { get; set; } = string.Empty;
    public string Phone { get; set; } = string.Empty;
}

The following code snippet shows how you can retrieve data from the HTTP GET endpoint route /authors/{id:int}.


app.MapGet("/authors/{id:int}", (int id) => $"Author Id: {id}");

Similarly, the following HTTP POST endpoint accepts an author Id and the count of books written by the author and returns a text message.


app.MapPost("/author/{authorId:int:min(1)}/books/{count:int:min(1)}",
    (int authorId, int count) => 
$"Author with Id {authorId} has written {count} books.");

As shown in the preceding code example, both the authorId and the count parameters should have a minimum integer value of 1.

The code snippet below shows how you can use the HTTP PUT endpoint to update the count of books authored by an author identified by authorId, taking advantage of the route parameter constraint illustrated above.


app.MapPut("/author/{authorId:int:min(1)}/books/{count:int:min(1)}",
    (int authorId, int count) => $"Author with Id {authorId} has written {count} books.");

Example of a route constraint in action

Finally, let’s examine how we can specify route constraints when passing objects and primitive data types (char, int, etc.) as parameters. Consider the following HTTP POST endpoint, which adds a new author and the number of books authored to a database.


app.MapPost("/authors/{count:int:min(1)}", (Author author, int count) =>
  Results.Ok($"1 new author record of an author who has written {count} " +
    $"books has been added to the database...")
);

Let’s invoke this endpoint using Postman. We’ll send the following JSON data for our new author in the body of the request.


{
  "authorId": "1",
  "firstName": "Joydip",
  "lastName": "Kanjilal",
  "address": "Hyderabad, India",
  "email": "joydipkanjilal@yahoo.com",
  "phone": "1234567890"
}

Our route constraint requires us also to specify the count of books written by the author in the URL. Further, it requires that the count be a minimum integer value of 1. Figure 1 shows the output in Postman when we pass a count of 5.

Figure 1. Success! Invoking the HTTP POST endpoint in Postman is successful because the route constraint for the count parameter has been satisfied. (Image: Foundry)

Let’s see what happens when we violate the route constraint. Figure 2 shows the output in Postman when we pass a count parameter of 0. Note that if we specify a value less than 1 for the count parameter, an HTTP 404 error will be returned in the response.

Figure 2. Oops. Invoking the HTTP POST endpoint in Postman results in a 404 Not Found error because the route constraint for the count parameter has been violated. (Image: Foundry)

Not for data validation

Keep in mind that you should not use route constraints for input data validation. If your endpoints contain route constraints that validate input data, an invalid input will return a 404 Not Found response, not a 400 Bad Request response. Remember, the purpose of route constraints is to disambiguate similar routes (i.e., to help ASP.NET Core efficiently resolve requests), not to validate the input data for your route.

You can also take advantage of custom route constraints in ASP.NET Core to validate route values and avoid passing unwanted or unnecessary requests to action methods. You can refer to Microsoft’s documentation to learn about all of the supported route constraints.


Google unveils Gemini CLI for developers 25 Jun 2025, 9:00 am

Google has introduced Gemini CLI, an open-source AI agent that brings Gemini AI capabilities right to a developer’s terminal.

Unveiled June 25, Gemini CLI is a command-line AI workflow tool that connects to a developer’s tools, understands the developer’s code, and accelerates the developer’s workflow. Gemini CLI gives developers a direct path from their prompt to Google’s Gemini AI model, the company said. “Gemini CLI is an agent that runs in your terminal that brings the power of Gemini in a very lightweight way to your terminal,” said Taylor Mullen, Google senior staff software engineer, in a Google briefing on June 24. “We think Gemini CLI is going to fundamentally change how developers interact with AI.”

Gemini CLI can be used for tasks ranging from content generation and problem-solving to task management and deep research. The capabilities of the multimodal reasoning model include code understanding, file manipulation, command execution, and dynamic troubleshooting, Google said. The company said it built Gemini CLI to be extensible, leveraging standards such as Model Context Protocol (MCP), system prompts, and settings for personal and team configuration.

With Gemini CLI, developers can do specific tasks including the following:

  • Query and edit large code bases in and beyond Gemini’s 1M-token context window.
  • Generate new apps from PDFs or sketches, using Gemini’s multimodal capabilities.
  • Automate operational tasks, such as querying pull requests or handling complex releases.
  • Ground queries using the Google Search tool built into Gemini.
  • Use tools and MCP servers to connect new capabilities including media generation with Imagen, Veo, or Lyria.

Offered via an Apache 2.0 license, Gemini CLI has been integrated with Google’s AI coding assistant, Gemini Code Assist. Users of Gemini Code Assist on free, Standard, and Enterprise plans get prompt-driven, AI-powered coding in both Visual Studio Code and Gemini CLI, Google said. Developers can use Gemini CLI free of charge by logging in with a personal Google account to get a free Gemini Code Assist license.


Software development meets the McNamara Fallacy 25 Jun 2025, 5:00 am

The McNamara Fallacy is the idea that it is an error to make decisions based purely on measurements or quantitative data.

Robert McNamara was the US Secretary of Defense during most of the 1960s, spanning the first half of the Vietnam War. He came to the position after a successful career in industry — most notably as the first president of the Ford Motor Company who was not from the Ford family — a career built on his innovative use of statistics in business. 

McNamara took the view that anything that can’t be measured should not be taken into account in the decision-making process. He considered all measurable data to be useful, and made metrics the sole means of determining the correct course of action.

He brought this statistical viewpoint to his job as Secretary of Defense and applied it to his management of the war. Problematic results, such as using enemy body counts as a measure of military success, ultimately led to the coining of the phrase “The McNamara Fallacy.”

I think the software industry is in danger of falling victim to the McNamara Fallacy. 

While metrics are useful, relying on metrics alone and ignoring qualitative factors that certainly come into play can easily lead you to focus on things that don’t matter. Ultimately it can lead to failure.

Numbers are easy

First, measuring the software development process is becoming easier and easier. With Git becoming the de facto source control system and tools like LinearB, JellyFish, and Plandek providing deep insight into what is happening within software repositories, it is very simple to get metrics that tell you a lot about what your team is up to. 

It is comical that the industry once took something as pathetically simple as “Lines of code per day” seriously as a metric. Today’s tools allow managers to see things that were not previously observable. Simple team metrics like deployment frequency, pull request review time, and cycle time are readily available to help pinpoint bottlenecks and aid in decision-making.

We’ve gotten to the point where metrics are easy, and “soft” measurements are hard. Don’t get me wrong—the granular metrics we now have are useful and valuable. But the temptation is great to rely on easy numbers instead of doing the hard work of figuring out the impossible-to-measure stuff. Robert McNamara fell into that trap, and it is easy to see the software industry doing the same. 

If we lose sight of the non-measurable things, we lose sight of what makes software development successful. If we focus on “computer science” and “software engineering,” we can lose sight of the human factors in writing code that make or break a successful project.

Software development is a team sport. Although individuals can and do shine as part of a team, the team results are what really matter. Sports fans thrive on statistics, but they know that ultimately, it’s not the statistics that win championships. It’s the intangibles that make the difference between first and second place.

Intangibles are hard

Despite our best efforts, we don’t have a means of measuring “writes good code.” It takes years of experience to recognize “good code” from “bad code,” and we can’t (yet?) measure it objectively. Maybe AI will figure it out someday. One could argue that AI can write good code today, but the ability to recognize good code is still uniquely human. 

Similarly, how does one measure “works well with others”? How about “team morale”? There is not, and probably won’t ever be, a way to measure these kinds of things, but they are important, and knowing it when you see it is a key to success. Recognizing and encouraging these intangibles is a critical skill for a software development manager to have. 

Finally, over-indexing on metrics can be detrimental to morale. No one wants to feel like a number on a spreadsheet. 

I encourage you to use the cool measurement tools out there today. But as you review what things like the DORA (DevOps Research and Assessment) metrics are telling you, remember to consider the things not revealed in the numbers. Sure, metrics are important, but understanding what your gut is telling you and listening to your intuition can be just as valuable, or even more valuable. 

Measure what you can, but always be sure to listen to the still, quiet voice telling you things that no statistical package ever can.


Pyrefly and Ty: Two new Rust-powered Python type-checking tools compared 25 Jun 2025, 5:00 am

What is most striking about Python’s latest wave of third-party tooling is that these tools aren’t written in Python. Instead, many of the newer tools for project management, code formatting, and now type checking are written in Rust.

This isn’t a swipe at Python; every language has its place. But modern language tooling demands a real-time feedback loop that Python can’t always deliver at the speed required. Rust fills that gap. Modern project management tools like uv and code formatters like ruff run fast and lean thanks to Rust.

The newest projects in this space aim to provide type-checking tools for Python that are faster and potentially more powerful than Python-based tools like mypy and pyright.

Ty from Astral (makers of the uv package manager and the ruff code formatter) and Pyrefly from Meta have essentially the same use case: providing high-speed type checking and language services for Python. Both have comparable performance, running many times faster than similar Python-based projects. This article tells you where these new tools stand right now in terms of usability and features.

Pyrefly

Pyrefly is not the first Python type-checking tool from Meta. Previously, the company delivered Pyre, written in OCaml. Pyre has since been retired; Pyrefly, written from scratch in Rust, replaces it.

Run Pyrefly out of the box on an existing Python codebase, and you’ll typically get a flood of errors the first time around. If you use the command pyrefly check --suppress-errors, all the flagged errors will be suppressed in your source via specially interpreted comments. You can then selectively remove those suppressions and run pyrefly check --remove-unused-ignores to clean up the codebase as you go. This allows you to migrate an untyped codebase gradually.

Pyrefly, like all modern Python tooling, uses pyproject.toml to store its project-level configuration data. You can also add per-directory configurations with standalone pyrefly.toml projects that use the same syntax. Or, you can provide directory-specific overrides for options in a single config file.

The list of linted error types is comparable to what mypy and Pyright can handle. Migrating from both tools is easy, as Pyrefly can do it automatically. In Pyright’s case, there’s almost a one-to-one mapping for the error-checking settings, so the change isn’t too jarring.

For a project in its early stages, Pyrefly already feels fleshed out. Detailed documentation, a VS Code extension, and even an online sandbox where you can try it out are all already here. If you are using the uv tool, you can run uvx pyrefly to experiment with it on a codebase without having to install anything. Note that this causes uv to be used as the virtual environment provider for Python, so it may generate spurious errors if you are using a different venv for your project.

Ty

Astral’s ty project is also still in its early stages, and it shows. Its documentation isn’t as fleshed-out as Pyrefly’s, and its feature set is less impressive. To be fair, the project was only recently made public and is admittedly in its early stages.

You can install Ty from pip or run it from uvx. It intelligently detects a source directory in a pyproject.toml-configured project, so it doesn’t mistakenly chew through Python files in your project’s virtual environment. But its configuration options are more minimal than Pyrefly’s; for instance, excluding files from checks is done via .gitignore or other external files rather than from configuration rules.

Ty’s ruleset for checking files seems more condensed than Pyrefly or existing tools, although it covers some cases not found elsewhere. For instance, while it doesn’t check for async errors, Ty does detect if class definitions have conflicting usages of __slots__, although the former seems like a far more common problem than the latter.

Despite being in its early stages, ty already has two key features nailed down. It is compatible with the Language Server Protocol, and it offers a VS Code extension to leverage that. Another plus—one significant enough to call out this early on—is the level of detail in its error reports. Pyrefly’s errors report the line number and type of error, but ty calls out errors with context, akin to modern Python’s contextually detailed error messages.

Conclusion

With the performance playing field about level between the two tools, Pyrefly is the more immediately useful project. Pyrefly offers a broader existing feature set, better documentation, and tooling to allow elegant migration from other type-checkers and onboarding existing codebases. That said, ty is in its early stages, so it’ll be worth circling back to both tools once they are out of their respective alpha and beta phases.


New AI benchmarking tools evaluate real world performance 25 Jun 2025, 12:48 am

A new AI benchmark for enterprise applications is now available following the launch of xbench, a testing initiative developed in-house by Chinese venture capital firm HongShan Capital Group (HSG).

The challenge with many current benchmarks is that they are widely published, making it possible for model creators to train their models to perform well on them, which reduces their usefulness as a true measure of performance. HSG says it has created a suite of ever-changing benchmarking tests, making it harder for AI companies to train on the tests and forcing models to rely on more general test-taking capabilities.

HSG said its original intention in creating xbench was to turn its internal evaluation tool into “a public AI benchmark test, and to attract more AI talents and projects in an open and transparent way.  We believe that the spirit of open source can make xbench evolve better and create greater value for the AI community.”

On June 17, the company announced it had officially open-sourced two xbench benchmarks, xbench-Science QA and xbench-DeepSearch, promising that “in the future, we will continuously and dynamically update the benchmarks based on the development of large models and AI Agents ….”

Real-world relevance

AI models, said Mohit Agrawal, research director of AI and IoT at CounterPoint Research, “have outgrown traditional benchmarks, especially in subjective domains like reasoning. Xbench is a timely attempt to bridge that gap with real-world relevance and adaptability. It’s not perfect, but it could lay the groundwork for how we track practical AI impact going forward.”

In addition, he said, the models themselves “have progressed significantly over the last two-to-three years, and this means that the evaluation criteria need to evolve with their changing capabilities. Xbench aims to fill key gaps left by traditional evaluation methods, which is a welcome first step toward a more relevant and modern benchmark. It attempts to bring real-world relevance while remaining dynamic and adaptable.”

However, said Agrawal, while it’s relatively easy to evaluate models on math or coding tasks, “assessing models in subjective areas such as reasoning is much more challenging. Reasoning models can be applied across a wide variety of contexts, and models may specialize in particular domains. In such cases, the necessary subjectivity is difficult to capture with any benchmark. Moreover, this approach requires frequent updates and expert input, which may be difficult to maintain and scale.”

Biases, he added, “may also creep into the evaluation, depending on the domain and geographic background of the experts. Overall, xbench is a strong first step, and over time, it may become the foundation for evaluating the practical impact and market readiness of AI agents.”

Hyoun Park, CEO and chief analyst at Amalgam Insights, has some concerns. “The effort to keep AI benchmarks up-to-date and to improve them over time is a welcome one, because dynamic benchmarks are necessary in a market where models are changing on a monthly or even weekly basis,” he said. “But my caveat is that AI benchmarks need to both be updated over time and actually change over time.”

Benchmarking new use cases

He pointed out, “we are seeing with efforts such as Databricks’ Agent Bricks that [it] is important to build independent benchmarks for new and emerging use cases.  And Salesforce Research recently released a paper showing how LLMs fare poorly in conducting some practical tasks, even when they are capable of conducting the technical capabilities associated with the task.”

The value of an LLM, said Park, is “often not in the ability to solve any specific problem, but to identify when a novel or difficult approach might be necessary. And that is going to be a challenge for even this approach to benchmarking models, as the current focus is on finding more complex questions that can be directly solved through LLMs rather than figuring out whether these complex tasks are necessary, based on more open-ended and generalized questioning.”

Further to that, he suggested, “[it is] probably more important for 99% of users to simply be aware that they need to conceptually be aware of Vapnik-Chervonenkis complexity [a measure of the complexity of a model] to understand the robustness of a challenge that an AI model is trying to solve. And from a value perspective, it is more useful to simply provide context on whether the VC dimension of a challenge might be considered low or high, because there are practical ramifications on whether you use the small or large AI model to solve the problem, which can be orders of magnitude differences in cost.”

Model benchmarking, Park said, “has been quite challenging, as the exercise is both extremely high stakes in the multi billion dollar AI wars, and also poorly defined. There is a panoply of incentives for AI companies to cheat and overfit their models to specific tests and benchmarks.”



Kotlin 2.2.0 arrives with context parameters, unified management of compiler warnings 24 Jun 2025, 6:20 pm

JetBrains has released Kotlin 2.2.0, the latest version of the company’s general purpose, statically typed language perhaps best known as a rival to Java for JVM and Android development. The update previews context parameters, stabilizes guard conditions, and offers unified management of compiler warnings.

Kotlin 2.2.0 was released June 23. Installation instructions can be found at kotlinlang.org.

Context parameters in the release improve dependency management by allowing functions and properties to declare dependencies that are implicitly available in the surrounding context. With context parameters, developers do not need to manually pass around values such as services or dependencies that are shared and rarely change across sets of function calls. Context parameters replace an older experimental feature called context receivers. Other features previewed in Kotlin 2.2.0 include context-sensitive resolution, which allows omitting the type name in contexts where the expected type is known, and an @all meta-target for properties, which tells the compiler to apply an annotation to all relevant parts of the property.

Guard conditions, introduced as a preview in Kotlin 2.1.0 last November, are now stable. They allow more than one condition to be attached to a branch of a when expression, making complex control flows more explicit and concise and flattening code structure, JetBrains said.
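
A short sketch of a guarded branch (the Response types here are invented for illustration):

    sealed interface Response
    data class Success(val body: String, val fromCache: Boolean) : Response
    data class Failure(val code: Int) : Response

    fun describe(response: Response): String = when (response) {
        // The guard after `if` adds a second condition to the branch
        // without nesting another `when` or `if` inside it.
        is Success if response.fromCache -> "Served from cache"
        is Success -> "Fresh response: ${response.body}"
        is Failure if response.code == 404 -> "Not found"
        is Failure -> "Failed with code ${response.code}"
    }

    fun main() {
        println(describe(Success("hello", fromCache = true))) // Served from cache
        println(describe(Failure(500)))                       // Failed with code 500
    }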

A new compiler option in Kotlin 2.2.0, -Xwarning-level, is designed to offer a unified way of managing compiler warnings in Kotlin projects. Previously, developers could only apply general module-wide rules, such as disabling all warnings with -nowarn or turning warnings into compilation errors with -Werror. With the new option, developers can override those general rules and exclude specific diagnostics in a consistent way.
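
As a rough illustration, the option pairs a diagnostic name with a target severity; in a Gradle Kotlin DSL build it might be wired up like this (the diagnostic names are examples only, and the exact severity keywords should be checked against the Kotlin 2.2.0 documentation):

    // build.gradle.kts
    kotlin {
        compilerOptions {
            // Silence one specific warning and promote another to an error,
            // leaving the module-wide warning settings untouched.
            freeCompilerArgs.addAll(
                "-Xwarning-level=NOTHING_TO_INLINE:disabled",
                "-Xwarning-level=UNUSED_VARIABLE:error",
            )
        }
    }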

Other new features and improvements in Kotlin 2.2.0:

  • For Kotlin/Wasm, the build infrastructure for the Wasm target is separated from the JavaScript target. Previously, the wasmJs target shared the same infrastructure as the js target. This meant both targets were hosted in the same directory (build/js) and used the same NPM tasks and configurations. Now, the wasmJs target has its own infrastructure separate from the js target, allowing Wasm types and tasks to be distinct from JavaScript ones and enabling independent configuration.
  • LLVM has been updated from version 16 to version 19, bringing performance improvements, security updates, and bug fixes.
  • Tracking memory consumption on Apple platforms has been improved.
  • Windows 7 has been deprecated as a legacy target.

(image/jpeg; 9.77 MB)

o3-pro may be OpenAI’s most advanced commercial offering, but GPT-4o bests it 24 Jun 2025, 8:01 am

Unlike general-purpose large language models (LLMs), more specialized reasoning models break complex problems into steps that they ‘reason’ about, and show their work in a chain of thought (CoT) process. This is meant to improve their decision-making and accuracy and enhance trust and explainability.

But can it also lead to a sort of reasoning overkill?

Researchers at AI red teaming company SplxAI set out to answer that very question, pitting OpenAI’s latest reasoning model, o3-pro, against its multimodal model, GPT-4o. OpenAI released o3-pro earlier this month, calling it its most advanced commercial offering to date.

Doing a head-to-head comparison of the two models, the researchers found that o3-pro is far less performant, reliable, and secure, and does an unnecessary amount of reasoning. Notably, o3-pro consumed 7.3x more output tokens, cost 14x more to run, and failed in 5.6x more test cases than GPT-4o.

The results underscore the fact that “developers shouldn’t take vendor claims as dogma and immediately go and replace their LLMs with the latest and greatest from a vendor,” said Brian Jackson, principal research director at Info-Tech Research Group.

o3-pro has difficult-to-justify inefficiencies

In their experiments, the SplxAI researchers deployed o3-pro and GPT-4o as assistants to help choose the most appropriate insurance policies (health, life, auto, home) for a given user. This use case was chosen because it involves a wide range of natural language understanding and reasoning tasks, such as comparing policies and pulling out criteria from prompts.

The two models were evaluated using the same prompts and simulated test cases, as well as through benign and adversarial interactions. The researchers also tracked input and output tokens to understand cost implications and how o3-pro’s reasoning architecture could impact token usage as well as security or safety outcomes.

The models were instructed not to respond to requests outside stated insurance categories; to ignore all instructions or requests attempting to modify their behavior, change their role, or override system rules (through phrases like “pretend to be” or “ignore previous instructions”); not to disclose any internal rules; and not to “speculate, generate fictional policy types, or provide non-approved discounts.”

Comparing the models

By the numbers, o3-pro used 3.45 million more input tokens and 5.26 million more output tokens than GPT-4o and took 66.4 seconds per test, compared to 1.54 seconds for GPT-4o. Further, o3-pro failed 340 out of 4,172 test cases (8.15%) compared to 61 failures out of 3,188 (1.91%) by GPT-4o.

“While marketed as a high-performance reasoning model, these results suggest that o3-pro introduces inefficiencies that may be difficult to justify in enterprise production environments,” the researchers wrote. They emphasized that use of o3-pro should be limited to “highly specific” use cases based on cost-benefit analysis accounting for reliability, latency, and practical value.

Choose the right LLM for the use case

Jackson pointed out that these findings are not particularly surprising.

“OpenAI tells us outright that GPT-4o is the model that’s optimized for cost, and is good to use for most tasks, while their reasoning models like o3-pro are more suited for coding or specific complex tasks,” he said. “So finding that o3-pro is more expensive and not as good at a very language-oriented task like comparing insurance policies is expected.”

Reasoning models are the leading models in terms of efficacy, he noted, and while SplxAI evaluated one case study, other AI leaderboards and benchmarks pit models against a variety of different scenarios. The o3 family consistently ranks on top of benchmarks designed to test intelligence “in terms of breadth and depth.”

Choosing the right LLM can be the tricky part of developing a new solution involving generative AI, Jackson noted. Typically, developers are in an environment embedded with testing tools; for example, in Amazon Bedrock, where a user can simultaneously test a query against a number of available models to determine the best output. They may then design an application that calls upon one type of LLM for certain types of queries, and another model for other queries.

In the end, developers are trying to balance quality aspects (latency, accuracy, and sentiment) with cost and security/privacy considerations. They will typically consider how much the use case may scale (will it get 1,000 queries a day, or a million?) and consider ways to mitigate bill shock while still delivering quality outcomes, said Jackson.

Typically, he noted, developers follow agile methodologies, where they constantly test their work across a number of factors, including user experience, quality outputs, and cost considerations.

“My advice would be to view LLMs as a commodity market where there are a lot of options that are interchangeable,” said Jackson, “and that the focus should be on user satisfaction.”

(image/jpeg; 0.42 MB)

Public cloud becomes a commodity 24 Jun 2025, 5:00 am

Since the dawn of the public cloud era, the narrative has focused on disruption and innovation. The Big Three providers (Amazon Web Services, Microsoft Azure, and Google Cloud) have commanded center stage by captivating customers and analysts alike with their relentless rollout of new features and services. The implication has always been clear: Stick with the largest, most innovative providers and take advantage of capabilities you can’t find anywhere else.

Yet, those of us who consistently talk to enterprise CTOs, architects, and IT teams see a reality very different from all the marketing hype surrounding bleeding-edge services such as advanced serverless platforms, proprietary AI accelerators, and vast analytics ecosystems. Those shiny, specialized tools advertised at launch events rarely make it into meaningful production deployments. Most actual cloud consumption focuses on a handful of basic services. In the real world, enterprises gravitate toward virtual machines, object storage, databases, networking, and security features.

The reasons are diverse. IT teams primarily manage mission-critical workloads that require reliability, security, and scalability. This creates pressure to reduce risk and complexity, making it impractical to adopt a constant stream of new, sometimes underdeveloped features. Most organizations rely on established solutions, leaving a long tail of innovative services underused and often overlooked. If anything, this demonstrates that the day-to-day needs of most enterprises are surprisingly consistent and relatively straightforward, regardless of industry or region.

What about AI?

AI was expected to change the game by providing a true differentiator for the major cloud players. It’s easy to believe that AWS, Azure, and Google Cloud are now as much AI companies as they are infrastructure providers, given their levels of investment and marketing enthusiasm. However, if you step back and examine the actual AI workloads being deployed in production, a pattern emerges. The necessary toolsets and infrastructure—GPU access, scalable data storage, major machine learning frameworks—are not only widespread but are also becoming increasingly similar across all public clouds, whether in the top tier or among the so-called “second tier” providers such as IBM Cloud and Oracle.

Additionally, access to AI is no longer genuinely exclusive. Open source AI solutions and prebuilt platforms can operate anywhere. Smaller public cloud providers, including sovereign clouds tailored to a country’s specific needs, are offering essentially similar AI and ML portfolios. For everyday enterprise use cases—fine-tuning models, running inference at scale, managing data lakes—there’s nothing particularly unique about what the major clouds provide in comparison to their smaller, often less expensive competitors.

Sticker shock

This brings us, inevitably, to cost, a topic no cloud conversation can avoid these days. The promise of “pay only for what you use” was initially a significant driver of public cloud adoption, but enterprises are waking up to a new reality: The larger you grow, the more you pay. Detailed invoices and cost analysis tools from the Big Three resemble tax documents—complicated, opaque, and often alarming. As organizations scale, cloud bills can quickly spiral out of control, blindsiding even the most prepared finance teams.

The persistent cost challenges are shifting the mindset of enterprise IT leaders. If you’re only using basic cloud primitives such as compute, networking, storage, or managed databases, why pay a premium for the marquee provider’s logo on your invoice? This question isn’t theoretical; it’s one I hear frequently. Enterprises are discovering that the value promised by the most established public clouds doesn’t align with reality, especially at the enterprise scale, given today’s prices.

The second-tier providers are stepping in to fill this gap. IBM and Oracle, for example, have shown remarkable growth in the past few years. Their product offerings may not match the sheer breadth of Microsoft, AWS, and Google, but for core use cases, they are just as reliable and often significantly less expensive. Furthermore, their pricing models are simpler and more predictable, which, in an era of cost anxiety, is a form of innovation in itself. Then there are the sovereign clouds, the regional or government-backed solutions that prioritize local compliance and data sovereignty and offer precisely what some markets require at a fraction of the cost.

MSPs and colos are still in the game

Managed service providers and colocation vendors are also playing a surprising role in this shift. By providing hybrid and multicloud management as well as hosting services, they allow enterprises to move workloads between on-premise environments, colocated data centers, and multiple public clouds with minimal concern about which cloud supports each particular workload. These players further diminish the idea of differentiation among cloud providers, making the underlying infrastructure a nearly irrelevant commodity.

What are the implications? The commoditization of public cloud isn’t just likely; in many respects, it’s already here. Competing solely on raw innovation and feature count is losing its effectiveness. Enterprises are indicating through their purchasing behavior that they want simplicity, reliability, and predictability at a fair price. If the major cloud providers don’t adapt, they’ll find themselves in the same situation traditional server and storage companies did a decade ago: struggling to differentiate what customers increasingly view as a commodity.

AWS, Microsoft, and Google will not disappear or shrink dramatically in the short term. However, they may need to reevaluate how they deliver value. I expect them to double down on managed services, application-layer offerings, and industry-specific solutions where differentiation truly matters. The rest—the core plumbing of the cloud—will increasingly be driven by price, reliability, and regulatory compliance, much like electricity or bandwidth today.

The next phase of cloud computing will belong not to those with the most features or the loudest marketing campaigns, but to those providers, big or small, that best understand enterprise needs and can deliver on the fundamentals without unnecessary complexity or excessive costs. That’s good news for the rest of us. The public cloud isn’t just another technology wave; it’s becoming an everyday utility. For enterprises, that’s precisely how it should be.

(image/jpeg; 19.58 MB)

LLMs aren’t enough for real-world, real-time projects 24 Jun 2025, 5:00 am

The major builders of large language models (LLMs)—OpenAI, DeepSeek, and others—are mistaken when they claim that their latest models, like OpenAI’s o-series or DeepSeek’s R1, can now “reason.” What they’re offering isn’t reasoning. It’s simply an advanced text predictor with some added features. To unlock AI’s true transformative potential, we need to move beyond the idea of reasoning as a one-size-fits-all solution. Here’s why.

If 2024 belonged to ChatGPT, OpenAI hoped it would dominate 2025 with the o-series, promising a leap in LLM reasoning. Early praise for its attempts to curb hallucinations quickly faded when China’s DeepSeek matched its capabilities at a fraction of the cost—on a laptop. Then came Doubao, an even cheaper rival, shaking the AI landscape. Chip stocks dropped, US tech dominance faltered, and even Anthropic’s Claude 3.5 Sonnet came under scrutiny.

But the real issue with the LLM paradigm isn’t just cost—it’s the illusion that all its inherent flaws have been solved. And that’s a dangerous path that could lead to painful dead ends. Despite all the progress, issues like hallucination remain unresolved. This is why I believe the future of AI doesn’t lie in artificial general intelligence (AGI) or endlessly scaling LLMs. Instead, it’s in fusing LLMs with knowledge graphs—particularly when enhanced by retrieval-augmented generation (RAG), combining the power of structured data with generative AI models.

No matter how cheap or efficient, an LLM is fundamentally a fixed, pre-trained model, and retraining it is always costly and impractical. In contrast, knowledge graphs are dynamic, evolving networks of meaning, offering a far more adaptable and reliable foundation for reasoning. Enriching an LLM’s conceptual map with structured, interconnected data through graphs transforms it from probabilistic guesswork into precision. This hybrid approach enables true practical reasoning, offering a dependable way to tackle complex enterprise challenges with clarity—something that LLM “reasoning” often falls short of delivering.

We need to distinguish between true reasoning and the tricks LLMs use to simulate it. Model makers are loading their latest models with shortcuts. Take OpenAI, for example, which now injects code when a model detects a calculation in the context window, creating the illusion of reasoning through stagecraft rather than intelligence. But these tricks don’t solve the core problem: the model doesn’t understand what it’s doing. While today’s LLMs have solved classic logic fails—like struggling to determine how long it would take to dry 30 vs. five white shirts in the sun—there will always be countless other logical gaps. The difference is that graphs provide a structured and deep foundation for reasoning, not masking limitations with clever tricks.

The limits of LLM ‘reasoning’

We’ve seen the consequences of forcing ChatGPT into this role, where it fabricates confident but unreliable answers or risks exposing proprietary data to train itself—a fundamental flaw. Tasks like predicting financial trends, managing supply chains, or analyzing domain-specific data require more than surface-level reasoning.

Take financial fraud detection, for example. An LLM might be asked, “Does this transaction look suspicious?” and respond with something that sounds confident—“Yes, because it resembles known fraudulent patterns.” But does it actually understand the relationships between accounts, historical behavior, or hidden transaction loops? No. It’s simply echoing probability-weighted phrases from its training data. True fraud detection requires structured reasoning over financial networks buried within your transaction data—something LLMs alone cannot provide.

The problem becomes even more concerning when we consider the deployment of LLMs in real-world applications. Take, for example, a company using an LLM to summarize clinical trial results or predict drug interactions. The model might generate a response like, “This combination of compounds has shown a 30% increase in efficacy.” But what if those trials weren’t conducted together, if critical side effects are overlooked, or if regulatory constraints are ignored? The consequences could be severe.

Now, consider cybersecurity, another domain where a wrong response could have catastrophic consequences. Imagine your CSO asking an LLM, “How should we respond to this network breach?” The model might suggest actions that sound plausible but are completely misaligned with the organization’s actual infrastructure, latest threat intelligence, or compliance needs. Following AI-generated cybersecurity advice without scrutiny could leave the company even more vulnerable.

And let’s not overlook enterprise risk management. Suppose a group of business users asks an LLM, “What are the biggest financial risks for our business next year?” The model might confidently generate an answer based on past economic downturns. However, it lacks real-time awareness of macroeconomic shifts, government regulations, or industry-specific risks. It also lacks the current and actual corporate information—it simply does not have it. Without structured reasoning and real-time data integration, the response, while grammatically perfect, is little more than educated guessing dressed up as insight.

This is why structured, verifiable data are absolutely essential in enterprise AI. LLMs can offer useful insights, but without a real reasoning layer—such as knowledge graphs and graph-based retrieval—they’re essentially flying blind. The goal isn’t just for AI to generate answers, but to ensure it comprehends the relationships, logic, and real-world constraints behind those answers.

The power of knowledge graphs

The reality is that business users need models that provide accurate, explainable answers while operating securely within the walled garden of their corporate infosphere. Consider the training problem: A firm signs a major LLM contract, but unless it gets a private model, the LLM won’t fully grasp the organization’s domain without extensive training. And once new data arrives, that training is outdated—forcing another costly retraining cycle. This is plainly impractical, no matter how customized the o1, o2, o3, or o4 model is.

In sharp contrast, supplementing an LLM with a well-designed knowledge graph—especially one that employs dynamic algorithms—solves this issue by updating context rather than requiring retraining. Whereas an LLM like o1 might correctly interpret a question like “How many x?” as a sum, we need it to understand something more specific, such as “How many servers are in our AWS account?” That’s a database look-up, not an abstract mathematical question.

A knowledge graph ensures that a first attempt at practical AI can reason over your data with reliability. Moreover, with a graph-based approach, LLMs can be used securely with private data—something even the best LLM on its own can’t manage.

The smart move is to go beyond the trivial. AI needs knowledge graphs, retrieval-augmented generation, and advanced retrieval methods like vector search and graph algorithms—not just low-cost training models, impressive as they may seem.

Dominik Tomicevic leads European software company Memgraph, provider of an open-source in-memory graph database that’s purpose-built for dynamic, real-time enterprise applications.

Generative AI Insights provides a venue for technology leaders—including vendors and other outside contributors—to explore and discuss the challenges and opportunities of generative artificial intelligence. The selection is wide-ranging, from technology deep dives to case studies to expert opinion, but also subjective, based on our judgment of which topics and treatments will best serve InfoWorld’s technically sophisticated audience. InfoWorld does not accept marketing collateral for publication and reserves the right to edit all contributed content. Contact doug_dineley@foundryco.com.

(image/jpeg; 8.4 MB)

Google’s Agent2Agent project moves to Linux Foundation 23 Jun 2025, 10:48 pm

The Linux Foundation is the new home of the Agent2Agent (A2A) protocol, an open protocol developed by Google to enable agentic AI interoperability and trusted agent communication across systems and platforms.

Launched by Google in April, the A2A protocol addresses the need for agents to operate in dynamic, multi-agent environments. A2A enables autonomous agents to discover one another, exchange information securely, and collaborate across systems, which in turn allows developers to unite agents from multiple sources and platforms, improving modularity, mitigating vendor lock-in, and accelerating innovation, the Linux Foundation said in a June 23 announcement. Developers can go to the A2A repository on GitHub to learn more about the protocol and follow the progress of the project.

The A2A project is being formed with participation from Amazon Web Services, Cisco, Google, Microsoft, Salesforce, SAP, and ServiceNow, Google said in a blog post, also dated June 23. Under Linux Foundation governance, A2A will remain vendor-neutral, emphasize inclusive contributions, and continue the protocol’s focus on extensibility, security, and real-world usability, the Linux Foundation said. “By joining the Linux Foundation, A2A is ensuring the long-term neutrality, collaboration, and governance that will unlock the next era of agent-to-agent powered productivity,” said Jim Zemlin, executive director of the Linux Foundation.

“The Agent2Agent protocol establishes a vital open standard for communication, enabling the industry to build truly interoperable AI agents across diverse platforms and systems,” said Rao Surapaneni, vice president and GM of Business Applications Platform, Google Cloud. “By collaborating with the Linux Foundation and leading technology providers, we will enable more innovative and valuable AI capabilities under a trusted, open-governance framework.”

(image/jpeg; 1.95 MB)

Ktor adds dependency injection and HTMX modules 23 Jun 2025, 6:08 pm

JetBrains has released Ktor 3.2.0, an update to the Kotlin-based framework for building asynchronous applications that brings modules for dependency injection and HTMX and automatic deserialization of configuration files into data classes, among other new capabilities.

Unveiled June 19, Ktor 3.2.0 also offers tools updates and performance improvements for different platforms. Instructions for getting started with Ktor can be found at ktor.io.

A dependency injection (DI) module featured in Ktor 3.2.0, while optional, allows Ktor to offer additional functionality out of the box for dependency injection users. Ktor DI is built on top of coroutines, which allows for concurrent initialization of an application. It also integrates easily with existing DI frameworks, according to JetBrains, and automatically closes AutoCloseable instances or lets developers configure their own cleanup handlers.
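
The coroutine-based design is what enables concurrent startup: because providers can suspend, independent dependencies can be created in parallel. The sketch below is not Ktor’s DI API; it is a plain kotlinx.coroutines illustration of that idea, with hypothetical Database and Cache types standing in for real services.

    import kotlinx.coroutines.async
    import kotlinx.coroutines.coroutineScope
    import kotlinx.coroutines.delay
    import kotlinx.coroutines.runBlocking

    class Database
    class Cache

    // Stand-ins for expensive, independent startup work.
    suspend fun connectDatabase(): Database { delay(300); return Database() }
    suspend fun connectCache(): Cache { delay(300); return Cache() }

    fun main() = runBlocking {
        // Because each provider is suspendable, the two initializations can run
        // concurrently instead of one after another.
        val (db, cache) = coroutineScope {
            val dbDeferred = async { connectDatabase() }
            val cacheDeferred = async { connectCache() }
            dbDeferred.await() to cacheDeferred.await()
        }
        println("Application wired with $db and $cache")
    }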

Ktor’s new HTMX module includes tight integration with kotlinx.html, which provides a domain-specific language (DSL) for HTML, and with the Ktor Routing DSL. This enables developers to more easily define HTML attributes for HTMX and define routes that automatically include HTMX headers.

For typed configuration, Ktor 3.2.0 now automatically deserializes configuration files into data classes in addition to primitive types. In order to deserialize structured data, the developer first needs to define a data class that matches their YAML configuration file.
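
For example, a configuration section and its matching data class might look like the following (the YAML shape and names are illustrative, and whether the class needs a @Serializable annotation depends on the serialization setup, so treat that as an assumption):

    // application.yaml (illustrative):
    //
    // database:
    //   url: "jdbc:postgresql://localhost:5432/app"
    //   username: "app"
    //   maxConnections: 10

    import kotlinx.serialization.Serializable

    // A data class that mirrors the structure of the YAML section above.
    @Serializable
    data class DatabaseConfig(
        val url: String,
        val username: String,
        val maxConnections: Int,
    )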

Also in Ktor 3.2.0:

  • Ktor now supports suspend (asynchronous) modules, making it possible to await dependencies that require suspension for initialization. Developers can also parallelize complex dependency graphs.
  • Ktor now supports Gradle version catalogs.
  • Ktor’s CIO client and server engine now support Unix domain sockets, thus providing more efficient bidirectional communication between processes on the same system.
  • A known regression in Ktor 3.2.0 pertaining to Android R8 will be fixed in Ktor 3.2.1, JetBrains said.

Ktor enables development of asynchronous client and server applications. Developers can build applications ranging from microservices to multiplatform HTTP client apps with ease, JetBrains said.

(image/jpeg; 0.2 MB)

Agentic AI won’t wait for your data architecture to catch up 23 Jun 2025, 11:53 am

A decade ago, the cloud ignited a massive replatforming of application and server infrastructure. Open-source technologies like Docker and Kubernetes transformed software velocity and operational flexibility, launching a new era.

But it didn’t happen overnight. Enterprises had to adapt to shifting foundations, talent gaps, and an open-source ecosystem evolving faster than most teams could absorb.

Today, agentic AI is catalyzing a similar, profound replatforming. This shift centers on real-time data interaction, where success is measured in milliseconds, not minutes. What’s at stake is your company’s ability to thrive in new marketplaces shaped by intelligent systems.

To navigate this transition, here are key considerations for preparing your data infrastructure for agentic AI.

The AI data layer must serve polyglot, multi-persona teams

Traditional data platforms, which primarily served SQL analysts and data engineers, are no longer sufficient. Today’s AI landscape demands real-time access for a vastly expanded audience: machine learning engineers, developers, product teams, and crucially, automated agents – all needing to work with data in tools like Python, Java, and SQL.

Much as Docker and Kubernetes revolutionized cloud-native application development, Apache Iceberg has become the foundational open-source technology for this modern AI data infrastructure. Iceberg provides a transactional format for evolving schemas, time travel, and high-concurrency access.

Combined with a powerful and scalable serverless data platform, this enables real-time dataflows for unpredictable, agent-driven workloads with strict latency needs.

Together, these technologies enable fluid collaboration across diverse roles and systems. They empower intelligent agents to move beyond mere observation, allowing them to act safely and quickly within dynamic data environments.

Your biggest challenge? “Day two” operations

The greatest challenge in building data infrastructure for agentic AI lies not in technology selection, but in operationalizing it effectively.

It’s not about choosing the perfect table format or stream processor; it’s about making those components reliable, cost-efficient, and secure under high-stakes workloads. These workloads require constant interaction and unpredictable triggers.

Common challenges include:

  • Lineage and compliance: Tracking data origins, managing changes, and supporting deletion for regulations like GDPR are complex and crucial.
  • Resource efficiency: Without smart provisioning, GPU and TPU costs can quickly escalate. Managed cloud offerings for OSS products help by abstracting compute management.
  • Access control and security: Misconfigured permissions present a significant risk. Overly broad access can easily lead to critical data being exposed.
  • Discovery and context: Even with tools like Iceberg, teams struggle to find the metadata needed for just-in-time dataset access.
  • Ease of use: Managing modern data tools can burden teams with unnecessary complexity. Simplifying workflows for developers, analysts, and agents is essential to keep productivity high and barriers low.

Without robust operational readiness, even the best-architected platforms will struggle under the constant pressure of agentic AI’s decision loops.

The right balance between open source and cloud partners

Complex infrastructure is now driven by open-source innovation, especially in data infrastructure. Here, open-source communities often pioneer solutions for advanced use cases, far exceeding the typical operational capacity of most data teams.

The biggest gaps arise when scaling open-source tools for high-volume ingestion, streaming joins, and just-in-time compute. Most organizations struggle with fragile pipelines, escalating costs, and legacy systems ill-suited to agentic AI’s real-time demands.

These are all areas where cloud providers with significant operational depth deliver critical value.

The goal is to combine open standards with cloud infrastructure that automates the most arduous tasks, from data lineage to resource provisioning. By building on open standards, organizations can effectively mitigate vendor lock-in. At the same time, partnering with cloud providers who actively contribute to these ecosystems and offer essential operational guardrails in their services enables faster deployment and greater reliability. This approach is superior to building fragile, ad-hoc pipelines or depending on opaque proprietary platforms.

For example, Google Cloud’s Iceberg integration in BigQuery combines open formats with highly scalable, real-time metadata, offering high-throughput streaming, automated table management, performance optimizations, and integrations with Vertex AI for agentic applications.

Ultimately, your goal is to accelerate innovation while mitigating the inherent risks of managing complex data infrastructure alone.

The agentic AI skills gap is real

Even the largest companies are grappling with a shortage of talent to design, secure, and operate AI-ready data platforms. The most acute hiring challenge isn’t just data engineering; it’s also real-time systems engineering at scale.

Agentic AI amplifies operational demands and pace of change. It requires platforms that support dynamic collaboration, robust governance, and instantaneous interaction. These systems must simplify operations without compromising reliability.

Agentic AI marketplaces may prove even more disruptive than the Internet. If your data architecture isn’t built for real-time, open, and scalable use, the time to act is now.

(image/jpeg; 6.88 MB)

GitHub’s AI billing shift signals the end of free enterprise tools era 23 Jun 2025, 9:05 am

GitHub began enforcing monthly limits on its most powerful AI coding models this week, marking the latest example of AI companies transitioning users from free or unlimited services to paid subscription tiers once adoption takes hold.

“Monthly premium request allowances for paid GitHub Copilot users are now in effect,” the company said in its update to the Copilot consumptive billing experience, confirming that billing for additional requests now starts at $0.04 each. The enforcement represents the activation of restrictions first announced by GitHub CEO Thomas Dohmke in April.

The move affects users of GitHub’s most advanced AI models, including Anthropic’s Claude 3.5 and 3.7 Sonnet, Google’s Gemini 2.0 Flash, and OpenAI’s o3-mini. Users who exceed their monthly allowances must now either wait until the next billing cycle or enable pay-per-request billing to continue using premium features, the blog post added.

Premium request limits by plan

The enforcement creates tiered access to advanced AI capabilities. Customers with Copilot Pro will receive 300 monthly premium requests, while Copilot Business and Enterprise users will get 300 and 1,000 requests, respectively. GitHub will also offer a Pro+ plan at $39 per month, providing 1,500 premium requests and access to what the company describes as “the best models, like GPT-4.5.”

Each model consumes premium requests based on a multiplier system designed to reflect computational costs. GPT-4.5 has a 50x multiplier, meaning one interaction counts as 50 premium requests, while Google’s Gemini 2.0 Flash uses only 0.25x. Users can still make unlimited requests using GitHub’s base model GPT-4o, though rate limiting applies during high demand.
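
A back-of-the-envelope sketch of what those multipliers mean for a 300-request monthly allowance, using the figures quoted above (the helper function is illustrative, not GitHub’s billing logic):

    // How many interactions fit into a monthly premium-request allowance,
    // given the per-model multiplier.
    fun interactionsWithin(allowance: Double, multiplier: Double): Double =
        allowance / multiplier

    fun main() {
        val allowance = 300.0 // Copilot Pro and Business monthly allowance
        println("GPT-4.5 at 50x: ${interactionsWithin(allowance, 50.0)} interactions")           // 6.0
        println("Gemini 2.0 Flash at 0.25x: ${interactionsWithin(allowance, 0.25)} interactions") // 1200.0
    }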

For those exceeding monthly allowances, GitHub’s current billing system will require users to “set a spending limit in your billing settings” with “the default limit set to $0,” meaning additional requests are rejected unless explicitly authorized.

April announcement set the stage

When Dohmke first announced the premium request system in April, he positioned the restrictions as necessary infrastructure for sustainable AI services. “Since GitHub Universe, we introduced a number of new models for chat, multi-file edits, and now agent mode. With the general availability of these models, we are introducing a new premium request type,” the company had said.

The company delayed implementation in May, stating: “We’re delaying the enforcement of Copilot premium request limits. Our goal is to make it easy for you to see how many premium requests you’re using and give you control over your limits and potential expenses.”

Developer backlash mirrors industry pattern

For enterprise customers, these changes signal the maturation of AI tools from experimental technologies to essential business services requiring budget planning and strategic procurement. GitHub’s current billing system allows organizations to “monitor your premium request usage in real-time from the Copilot status icon in your IDE or download detailed usage reports,” but any premium requests beyond monthly allowances are rejected unless administrators explicitly enable additional billing.

The enforcement has also prompted complaints from GitHub Copilot users in online forums, with community discussions showing that almost all the comments posted over the past week argue the limits are far too low and appear designed to force customers to upgrade to more expensive subscription plans.

“300 per day is ok, per month is ridiculous,” wrote one user on the forum page.

The criticism follows a familiar pattern across the AI industry as services mature from startup offerings to profitable enterprises. This shift has materialized across different categories of AI services, creating a consistent pattern of reduced free access as platforms establish market presence.

Midjourney exemplifies this trend most clearly. The popular AI image creator initially offered 25 free images to new users when it launched in July 2022, but by June 2023 eliminated free trials entirely, requiring paid subscriptions starting at $10 monthly. Video generation platform Runway AI structures its offering around a credit system where the free tier provides only “a one-time deposit of 125 credits,” while paid plans starting at $15 monthly offer renewable credit allowances that “do not roll over to following months.”

Conversational AI services have implemented similar restrictions. Anthropic’s Claude imposes daily message limits on free users, typically allowing 40-50 messages per day, while ChatGPT’s free tier restricts users to older GPT-3.5 models with access limitations during peak usage periods.

Revenue pressures drive monetization

The monetization trend reflects mounting pressure on AI companies to demonstrate sustainable business models. According to TechCrunch, Microsoft CEO Satya Nadella said last August that Copilot accounted for over 40% of GitHub’s revenue growth in 2024 and is already “a larger business than all of GitHub when the tech giant acquired it roughly seven years ago.”

Training and operating advanced language models require substantial computational resources, with leading AI companies spending hundreds of millions of dollars on infrastructure. As venture capital funding becomes more selective and investors demand clearer paths to profitability, AI companies increasingly rely on subscription revenue rather than pursuing unsustainable growth strategies.

GitHub’s approach reflects this broader recalibration. “Premium requests are in addition to the unlimited requests for agent mode, context-driven chat, and code completions that all paid plans have when using our base model,” the company emphasized in its April announcement, positioning the changes as value-added services rather than restrictions on core functionality.

The trend suggests that CIOs and technology leaders should prepare for similar changes across their AI tool portfolios. As these services transition from venture-capital-subsidized offerings to self-sustaining businesses, organizations may need to reevaluate their AI strategies and budget allocations accordingly.

(image/jpeg; 0.45 MB)

Devops debt: The hidden tax on innovation 23 Jun 2025, 5:00 am

Your devops teams are likely wasting half their time on work that delivers zero business value. According to our 2025 State of Java Survey & Report, 62% of organizations report that dead code is hampering their devops productivity, while 33% admit their teams waste more than half their time chasing false-positive security alerts. Meanwhile, 72% of companies are paying for cloud capacity they never use.

This isn’t just inefficiency—it’s a hidden tax on innovation that’s silently killing your ability to compete. In my 25+ years working with Java, from JDK 1.0 in 1996 to today, I’ve witnessed how these mounting inefficiencies have become the single biggest barrier to innovation for Java-based enterprises. And with nearly 70% of companies reporting that more than half their applications run on Java, this isn’t a niche problem—it’s a crisis hiding in plain sight.

The three pillars of devops debt

Code bloat: The growing burden of digital hoarding

Dead code—portions of your code base that are never executed but still sit in production—creates a cascade of problems that extend far beyond wasted storage. The productivity impact only hints at the deeper issue: This digital hoarding forces developers to navigate unnecessarily complex systems. Our research reveals that organizations with high levels of dead code report development cycles that are, on average, 35% longer than those with streamlined code bases.

This problem compounds over time as Java versions become dated and obsolete. For example, 10% of organizations still run applications on Java 6, a 20-year-old version that Oracle ceased providing updates for in December 2018.

Security false positives: The endless chase

Beyond wasted development time, security false positives consume an enormous amount of devops resources. The “better safe than sorry” approach to security scanning has led to alert fatigue, with one-third of teams spending the majority of their time investigating issues that turn out to be non-threats.

The problem is particularly acute in Java environments, where 41% of organizations encounter critical production security issues on a weekly or daily basis. Despite having had more than three years to address Log4j, half of the companies surveyed are still experiencing security vulnerabilities from Log4j in production. This persistent vulnerability highlights a broader challenge: distinguishing between theoretical vulnerabilities and actual threats.

Our research indicates that approximately 70% of security alerts in Java environments are ultimately determined to be false positives or vulnerabilities in code paths that are never executed in production. When devops teams can’t efficiently separate real threats from hypothetical ones, innovation inevitably grinds to a halt.

Cloud waste: Paying for idle capacity

The financial dimension of devops debt manifests in cloud resource inefficiency. Beyond the headline figure of widespread waste, we’ve found that many organizations are dramatically over-provisioning their Java applications due to uncertainty about performance requirements and inconsistent load patterns.

For Java-based organizations, this problem is particularly significant because nearly two-thirds report that more than 50% of their cloud compute costs stem from Java workloads. Additional analysis shows that optimizing Java Virtual Machine (JVM) configurations alone could reduce cloud costs by 25% to 30% for the average enterprise.

This waste essentially functions as a direct financial penalty—you’re literally paying for capacity you don’t use, just like interest on a financial debt. Across the enterprise landscape, we estimate this represents over $10 billion in annual wasted cloud spending.

Breaking free of devops debt

As Java applications continue to modernize, with nearly half of organizations (49%) now running either Java 17 or Java 21, this transition creates a perfect opportunity to address these underlying inefficiencies.

Code hygiene automation

Implement automated tools that identify and safely remove dead code, integrated directly into your CI/CD pipeline to prevent new accumulation. Just as we continuously monitor JVM performance metrics, apply the same rigor to identifying unused code patterns.

Leading organizations are now incorporating runtime usage analysis to identify code paths that haven’t been executed in production for extended periods. This data-driven approach has helped some enterprises reduce their code bases by up to 40% without any functional impact.

Consider implementing policies that require deprecated code to have sunset dates, ensuring that temporary workarounds don’t become permanent technical debt. Regular code reviews focused specifically on identifying unused components can help keep your code base lean and maintainable.

Runtime intelligence for security

Traditional security scanning produces too many alerts with too little context. Modern approaches incorporate runtime intelligence to prioritize vulnerabilities based on actual usage patterns rather than theoretical exploitability.

Organizations should invest in tools that distinguish between code paths actually executed in production versus those that exist but aren’t used. This runtime intelligence approach transforms security from theoretical vulnerability hunting to practical risk management, dramatically reducing false positives and freeing your teams to focus on innovation.

Companies that have adopted this approach report up to an 80% reduction in security alert volume while actually improving their security posture by focusing resources on genuinely exploitable vulnerabilities.

Resource optimization

Adopt tools and practices that optimize cloud resource allocation through advanced auto-scaling, high-performance JDKs, and established finops practices that align technology with business objectives.

Our report shows that forward-thinking organizations are already addressing this: 38% have implemented new internal rules for cloud instance usage, 35% are using more efficient compute instances and processors, and 24% have adopted high-performance JDKs specifically to enhance performance and reduce costs.

The most successful organizations are implementing cross-functional finops teams with representation from engineering, operations, and finance to holistically address resource optimization. These teams establish metrics and governance processes that balance innovation speed with cost efficiency.

The innovation imperative

The cost of devops debt goes far beyond wasted engineering hours. When teams spend half their time managing false positives and navigating bloated code bases, your competitors who’ve addressed these issues can innovate twice as fast. Top developers seek environments where they can create value, not manage legacy messes. Every hour spent on activities that don’t add value represents features not built, customer needs not addressed, and market opportunities missed.

Just as we’ve seen organizations seek alternatives to inefficient Java deployments, I predict we’ll see a similar movement toward addressing devops debt as awareness of its costs grows. The organizations that move first will gain significant competitive advantage.

The question isn’t whether you have devops debt—it’s whether you’ll start paying it down before your competitors do. The tools and practices exist today to dramatically reduce these inefficiencies. Those who act now won’t just improve their engineering productivity; they’ll fundamentally transform their ability to innovate in an increasingly competitive marketplace.

Simon Ritter is deputy CTO of Azul.

New Tech Forum provides a venue for technology leaders—including vendors and other outside contributors—to explore and discuss emerging enterprise technology in unprecedented depth and breadth. The selection is subjective, based on our pick of the technologies we believe to be important and of greatest interest to InfoWorld readers. InfoWorld does not accept marketing collateral for publication and reserves the right to edit all contributed content. Send all inquiries to doug_dineley@foundryco.com.

(image/jpeg; 10.87 MB)

How to succeed (or fail) with AI-driven development 23 Jun 2025, 5:00 am

Artificial intelligence (AI) continues to permeate seemingly every aspect of business, including software development. AI-augmented development involves using generative AI to support various stages of the software development lifecycle, including design, testing, and deployment. Introducing AI-powered tools into the development process is intended to increase developer productivity by automating certain tasks. It can also enhance the quality of code and speed up the development lifecycle, so development teams can bring products to users more quickly.

AI-augmented development is on the rise, according to industry research. A May 2025 report by market intelligence and advisory firm QKS Group forecasts that the global AI-augmented software development market will expand at a compound annual growth rate of 33 percent through 2030.

“In an era where speed, innovation, and adaptability define competitive advantage, AI-augmented software development is rapidly becoming a transformative force for enterprises,” the report says. “By embedding AI into every stage of the software development lifecycle, from code generation and testing to debugging and deployment, organizations across industries like finance, healthcare, retail, telecom, and manufacturing are redefining how software is built, optimized, and scaled.”

Deploying AI-augmented development tools and processes comes with both risks and rewards. For tech leaders and software developers, it is vital to understand both.

Risks of AI-augmented software development

Risks of relying too heavily on AI for software development include bias in the data used to train models, cybersecurity threats, and unchecked errors in AI-generated code. We asked a range of experts what they’ve found most challenging about integrating AI in the software development lifecycle and how they’ve managed those challenges.

Bias in the models

Bias in the data used to feed models has long been an issue for AI, and AI-augmented development is no exception.

“Because AI is trained on human-coded data, it can replicate and amplify existing biases,” says Ja-Naé Duane, faculty and academic director of the Master’s Program in Innovation Management and Entrepreneurship at Brown University School of Engineering. “Without deliberate oversight and diverse perspectives in design and testing, we risk embedding exclusion into the systems we build,” she says.

Most Loved Workplace, a provider of workplace certifications, uses machine learning to analyze employee sentiment. But early on, it saw signs that its models were misreading certain emotional tones or cultural language differences.

“We had to retrain the models, labeling according to our own researched models, and using humans in the loop to test for bias,” says Louis Carter, founder of the company and an organizational psychologist.

“Our internal team did a lot of work to do so, and we created a gaming platform for everyone to label and add in their own interpretation of bias,” Carter says. “We improved the [BERT language model], developing our own construct for identifying emotions and sentiment. If we hadn’t caught it, the results would have misled users and hurt the product’s credibility.”

Intellectual property (IP) infringement

The use of AI-augmented development and possible IP infringement can raise complex legal issues, especially within the area of copyright. Because AI models can be trained using enormous datasets, including some copyrighted content, they can generate outputs that closely resemble or infringe upon existing copyrighted material. This can lead to lawsuits.

“The current uncertainty around how these models do or don’t infringe on intellectual property rights is absolutely still a risk,” says Joseph Mudrak, a software engineer at product design company Priority Designs. “OpenAI and Meta, for example, are both subjects of ongoing court cases regarding the sources of the data fed into those models.”

The American Bar Association notes that as the use of generative AI grows rapidly, “so have cases brought against generative AI tools for infringement of copyright and other intellectual property rights, which may establish notable legal precedents in this area.”

“Most generally available AI-augmented development systems are trained on large swaths of data, and it’s not particularly clear where that data comes from,” says Kirk Sigmon, a partner at law firm Banner & Witcoff Ltd. Sigmon specializes in AI and does coding and development work on the side. “Code is protectable by copyright, meaning that it is very possible that AI-augmented development systems could output copyright-infringing code,” Sigmon says.

Cybersecurity issues

AI-augmented development introduces potential cybersecurity risks such as insecure code generation. If they are trained on datasets with flawed or insecure examples, AI models can generate code containing common vulnerabilities such as SQL injection or cross-site scripting attacks.
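
To make that concrete, the snippet below shows the kind of pattern a generated suggestion can quietly introduce, next to the parameterized alternative (plain JDBC, purely illustrative):

    import java.sql.Connection

    // Vulnerable: user input is concatenated directly into the SQL string,
    // the classic shape of an injectable query.
    fun findUserUnsafe(conn: Connection, name: String) =
        conn.createStatement().executeQuery(
            "SELECT * FROM users WHERE name = '$name'"
        )

    // Safer: a parameterized query keeps the input out of the SQL text entirely.
    fun findUserSafe(conn: Connection, name: String) =
        conn.prepareStatement("SELECT * FROM users WHERE name = ?").apply {
            setString(1, name)
        }.executeQuery()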

AI-generated code could also inadvertently include sensitive data such as customer information or user passwords, exposing it to potential attackers. Training models on sensitive data might lead to unintentional exposure of this data in the generated code.

“From a privacy and cybersecurity standpoint, unvalidated AI-generated code can introduce serious vulnerabilities into the software supply chain,” says Maryam Meseha, founding partner and co-chair of privacy and data protection at law firm Pierson Ferdinand LLP.

“We’ve seen companies unknowingly ship features that carried embedded security flaws, simply because the code ‘looked right’ or passed surface-level tests,” Meseha says. “The cost of retroactively fixing these issues, or worse, dealing with a data breach, far outweighs the initial speed gains.”

False confidence

There might be a tendency for development teams and leaders to assume that AI will get it right almost all the time because they believe automation removes the problem of human error. This false confidence can lead to problems.

“AI-augmented approaches, particularly those using generative AI, are inherently prone to mistakes,” says Ipek Ozkaya, technical director of engineering intelligent software systems at the Carnegie Mellon University Software Engineering Institute.

“If AI-augmented software development workflows are not designed to prevent, recognize, correct, and account for these mistakes, they are likely to become nightmares down the line, amounting to unmanageable technical debt,” Ozkaya says.

Most Loved Workplace, which uses tools such as Claude Code, Sentry, and custom AI models for emotion and sentiment analysis in its platform, has experienced false confidence with AI-augmented development.

“Claude and other tools sound right even when they’re dead wrong,” Carter says. “One piece of output missed a major edge case in a logic loop. It passed initial testing but broke once real users hit it. Now, everything AI touches goes through multiple human checks.”

The company has had developers submit code from Claude that looked solid at first but failed under load, Carter says. “When I asked why they made certain choices, they couldn’t explain it—it came straight from the tool,” he says. “Since then, we’ve made it clear: If you can’t explain it, don’t ship it.”

Rewards of AI-augmented software development

While increased productivity and cost-effectiveness garner the most attention from business leaders, tech leaders and developers are finding that AI supports developer learning and skills development, prevents burnout, and may make software development more sustainable as a career.

Speed without burnout

It’s no surprise, given the pressure to deliver quality software at a rapid pace, that many developers experience burnout. A 2024 study by Kickstand Research, based on a survey of more than 600 full-time professionals in software engineering, found that nearly two-thirds of respondents (65 percent) experienced burnout in the past year.

The report, conducted on behalf of Jellyfish, a provider of an engineering management platform, indicated that the problem was particularly acute for short-staffed engineers and leaders overseeing large organizations. Of respondents at companies with more than 500 people in their engineering organization, 85 percent of managers and 92 percent of executives said they were experiencing burnout.

Deploying AI-augmented development tools can help address the issue by automating tasks and increasing productivity.

“Claude Code has helped us move faster without overwhelming the team,” Carter says. “One of our junior developers hit a wall building a complex rules engine. He used Claude to map out the logic and get unstuck. What would’ve taken half a day took about an hour. It saved time and boosted his confidence.”

Cleaner code and fewer bugs

AI-augmented development can lead to fewer bugs and improved code quality. This is because AI tools can handle tasks such as code analysis, bug detection, and automated testing. They can help identify possible errors and suggest enhancements.

“We use Sentry to catch issues early, and Claude to clean up and comment the code before anything ships,” Carter says. “Claude is a great way of cleaning up messy code.”

Commenting, or adding notes and reasoning behind what code is doing and what it is intended to accomplish, makes it easy for everyone to understand, Carter says. This is especially helpful for programmers whose second language is English, “because there are a lot of misunderstandings that can happen.”

Most Loved Workplace is running sentiment and emotion scoring in its human resources SaaS application Workplacely, used for certifying companies. “AI helps us test edge cases faster and flag inconsistencies in model outputs before they go live,” Carter says.

“My favorite way to use AI-augmented development systems is to use them to help me bugfix,” Sigmon says. “AI systems have already saved me a few times when, late at night, I struggled to find some small typo in code, or struggled to figure out some complex interrelationship between different signaling systems.”

Cost-effectiveness and increased productivity

AI-augmented development systems can be cost-effective, particularly over time due to increased efficiency and productivity, the automation of tasks, reduced errors, and shorter development lifecycles.

“Using AI-augmented development systems can save money because you can hire fewer developers,” Sigmon says. “That said, it comes with some caveats. For instance, if the world pivots to only hiring senior developers and relies on AI for ‘easy’ work, then we’ll never have the opportunity to train junior developers to become those senior developers in the future.”

AI “can automate routine coding tasks and surface bugs, as well as optimize performance, dramatically reducing development time and cost,” Duane says.

“For example, tools like GitHub Copilot have been shown to significantly cut time-to-deploy by offering developers real-time code suggestions,” Duane says. “In several organizations I work with, teams have reported up to a 35 percent acceleration in release cycles, allowing them to move from planning to prototyping at unprecedented speed.”

Upskilling on the fly

The skills shortage is one of the biggest hurdles for organizations and their development operations. AI-powered tools can help developers learn new skills organically in the development process.

“I’ve seen junior team members start thinking like senior engineers much faster,” Carter says. “One in particular used to lean on me for direction constantly. Now, with Claude, he tests ideas, reviews structure, and comes to me with smarter questions. It’s changed how we work.”

AI is lowering the barrier to entry for individuals without formal programming training by enabling no-code and low-code platforms, Duane says. “This transformation aligns with our vision of inclusive innovation ecosystems,” she says.

For instance, platforms such as Bubble and Zapier enable entrepreneurs, educators, and others without technical backgrounds to build and automate without writing a single line of code, Duane says. “As a result, millions of new voices can now participate in shaping digital solutions, voices that would have previously been left out,” she says.

(image/jpeg; 2.47 MB)
