Peer Deep with DTrace
Sun Microsystems, Inc.
Track, tune, and troubleshoot your systems in real time with Sun's
new dynamic tracing framework, part of the Solaris™ 10 OS.
18.May.04 -- Imagine if any question you had about your systems could
be magically answered -- instantly. Imagine how much easier it would
be to find system bottlenecks or understand complicated performance
issues. That's the dramatic effect that DTrace, a comprehensive
dynamic tracing framework for the Solaris Operating System (Solaris
OS), can have on your data center.
DTrace is one of several revolutionary new technologies found
in the Solaris 10 OS, which is available for preview now through
the Sun Software Express Program. Built into the Solaris OS with
25,000 probes in the kernel alone, DTrace is a boon to developers,
system administrators, and IT managers.
More powerful than any other tool in the industry, DTrace is an
unmatched dynamic tracing framework for troubleshooting your network
and tuning system performance -- in real time. DTrace lets you see
your entire Solaris OS system in an entirely new way, revealing
systemic problems that were previously invisible and helping you fix
performance issues that used to go unresolved. With DTrace, you can:
- Examine the behavior of user programs and the Solaris OS and
quickly identify the root causes of system and application bottlenecks
- Highlight trends and patterns to tune systems for best performance
- Track down performance problems across many layers of software
- Locate the cause of aberrant behavior
- Write reusable scripts for common or complex routines
- Specify the data DTrace collects, the actions it takes, and
the conditions under which it should take those actions
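As a minimal sketch of how a D program expresses that last point, the clause below names a probe (where to collect), a predicate (the condition), and an action (the data to record). The choice of the read system call and the process name "bash" is purely illustrative and does not come from this article.

```d
/*
 * Probe:     syscall::read:entry fires on entry to the read(2) system call.
 * Predicate: the clause body runs only when the calling process is named
 *            "bash" (an arbitrary, illustrative choice).
 * Action:    count the calls per process ID in an aggregation.
 */
syscall::read:entry
/execname == "bash"/
{
        @reads[pid] = count();
}
```

Saved under a hypothetical name such as reads.d, the script can be run as a privileged user with "dtrace -s reads.d"; pressing Ctrl-C ends tracing and prints the per-process counts.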
Across-the-Board Return on Investment (ROI)
"By providing a thorough understanding of your systems' behavior,
DTrace can lead to phenomenal network speed gains, slashed support
costs, and exceptionally effective tuning," says Greg Papadopoulos,
Sun Chief Technical Officer. "Simply put, DTrace is one of the most
significant innovations in operating systems in the last decade."
DTrace can help you pare your IT budget in several key ways:
- Performance problems can be understood on production machines;
there's no need to waste time and money reproducing them in a
separate test bed.
- Bottlenecks can be identified and fixed in minutes or hours
instead of days.
- Existing systems can handle more users or transactions.
- Service availability can be improved.
For example, when a server at Sun experienced degraded performance,
DTrace found and isolated a rogue application in just 20 minutes --
a task that might have taken 30 hours before DTrace. That Sun server
now supports 30 percent more desktops.
"The wins from DTrace can be so significant that orders-of-magnitude
performance increases are realized in production. This immediately
translates to a bottom line benefit for the business unit," says
Jarod Jenson, chief systems architect of Aeysis, a performance consultancy
in Houston.
The Mother of Invention
In 1997, Sun's Bryan Cantrill, now a senior staff engineer in
Solaris Kernel Development, and his team were working feverishly
on a performance problem that cropped up in the just-introduced
Sun Enterprise 10000 server. While running a benchmark, the server
mysteriously slowed down for a period of time. Six sleepless days
later, the team finally discovered the problem's root cause: a "totally
knuckle-headed" configuration mistake had set the server up
to act as a router.
"I came away shaken," declares Cantrill. "This was a problem that
any customer could have, but they wouldn't have the luxury of kernel
developers working around-the-clock writing custom code to understand
the problem. We had to find a better way." After two and a half
years of intense development, Cantrill and his team built that better
way: DTrace.
With DTrace, Sun is making available several innovative features not
found in other tracing software:
- You can safely use DTrace on production machines, as well as
in development and test bed systems, in real time.
- DTrace provides a single view of the software stack, from kernel
to application.
- You do not have to modify applications -- or even restart them
-- before putting DTrace into action.
- DTrace fully instruments the operating system.
"There's a class of problems for which you spend most of your time
theorizing what might be happening and then trying to use the available
monitoring tools to prove or disprove those theories," explains
Philip Beevers, a developer at Surrey, U.K.-based royalblue, a leading
supplier of global financial trading software.
"This is true particularly for performance problems in complex production
environments, which you can't easily reproduce and where traditionally
you can't add your own instrumentation. With DTrace, developers
can create tools which are tailored to proving or disproving those
theories."
Boon for Developers
Developers can use DTrace to analyze and optimize application
performance. DTrace makes testing and tuning more effective, with
shorter test cycles. That yields lower support costs.
For example, when a financial institution applied DTrace to one
of its business-critical applications, it uncovered a serious scalability
problem. The institution was able to fix the problem in less than
a day and netted "more than a 10 times throughput increase," according
to Cantrill.
In another first in the industry, DTrace lets programmers see
the interaction between their applications and the kernel by observing
the flow of control across the user/kernel boundary. And with DTrace's
easy-to-learn D language, you can build custom programs to dynamically
instrument the system and provide immediate, concise answers to
arbitrary questions about the operating system and user programs.
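A rough sketch of following control flow across the user/kernel boundary might look like the script below. It assumes the target process calls a user-level function named write (an illustrative choice, not one taken from this article) and uses a thread-local flag, here called follow, to trace kernel functions only while that call is in progress.

```d
#!/usr/sbin/dtrace -s

/* Indent output on function entry and unindent on return. */
#pragma D option flowindent

/* User level: the traced process enters a function named write. */
pid$target::write:entry
{
        self->follow = 1;
}

/* Kernel level: every kernel function entered or returned from
 * by this thread while the flag is set. The empty body prints
 * only the probe's function name. */
fbt:::
/self->follow/
{
}

/* User level again: the call returns, so stop following. */
pid$target::write:return
/self->follow/
{
        self->follow = 0;
}
```

The $target macro is filled in when the script is started with "dtrace -s" plus either -p and a process ID or -c and a command. Enabling every fbt probe produces a great deal of output, so in practice the kernel clause would usually be narrowed to a module or function of interest.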
System administrators can use DTrace in real time on a production
system because the system cannot be accidentally disrupted. While
active, DTrace only minimally affects the system by dynamically
selecting just the probe points you need. To further minimize its
impact, DTrace never requires a reboot, forced failure, special
diagnostic mode, or other changes to your system, applications,
or user accounts. Because you can resolve problems more quickly
than was ever before possible, you'll endure fewer and shorter service
interruptions.
"[An] Oracle [server] was eating CPU under a low load, and it was
very difficult to determine why," says Peter Baer Galvin, chief
technologist of Corporate Technologies, an enterprise systems integrator
in Burlington, Mass. "After a lot of debugging and experimenting
on [the] Solaris 8 [OS] without DTrace, we found that the problem
was actually the application server that was calling the database
server. With DTrace, this probably could have been solved in an
hour, rather than in a week."
No Comparison
Every major UNIX vendor, as well as Microsoft, offers some form
of tracing, but no method stacks up against DTrace.
"At first glance, many [users] try to infer commonalities to existing
observability tools. It does not take long, however, to realize
that a new paradigm has been established," says Jenson. "At every
site that I have used DTrace, a line of developers and support personnel
forms asking for their application to be next. DTrace is a competitive
advantage that everyone should utilize."
IBM AIXTrace, Linux Trace Toolkit, and Microsoft Event Tracing for
Windows are the most widely used alternatives. Each records a small
amount of predefined data at a few predefined points. These tools
can be used only for questions that can be answered with these points
or with the data that they provide. In contrast, DTrace has tens of
thousands of probes, can instrument running applications without restarting
them, and can record arbitrary data at each probe -- all on production
systems. With DTrace, you can query the system arbitrarily, receive
a precise answer in seconds, and take immediate action to resolve
the problem.
"I've been using DTrace to fetch details of disk I/O. Now I find
systems without DTrace can feel uncomfortable," says Brendan Gregg,
a UNIX developer and security consultant in Sydney, Australia. "I
keep wanting to fetch more details, and they aren't there. You could
say that DTrace is addictive."
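Fetching disk I/O details of the kind Gregg describes could start from something as small as the sketch below. It is an assumed example rather than one of Gregg's own scripts: it uses the io provider to add up the bytes of I/O requested, grouped by process name.

```d
/*
 * io:::start fires when an I/O request is about to be issued to a device.
 * args[0] describes the request's buffer; b_bcount is its size in bytes.
 * The aggregation keeps a running total per process name.
 */
io:::start
{
        @bytes[execname] = sum(args[0]->b_bcount);
}
```

One caveat: asynchronous writes are often issued by the kernel on a process's behalf, so some totals may be attributed to a system process rather than to the application that originally wrote the data.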
Every programmer, every system administrator, and every IT manager
faces inexplicable performance problems that bog down their systems,
bleed network resources, and squander the company's money. But you
don't need magic to tackle these problems. All you need is DTrace.