Avaya Group Safeguards Internal Information Quality
Avaya maintains more than 100 terabytes of customer, vendor, service, financial, and pricing data. To ensure that the stockpile complies with internal data standards, the company, which provides telecom equipment and services, last year established its Data Quality Center of Excellence.
The center’s two dozen employees are responsible for implementing data quality management practices, such as avoiding the creation of duplicate records. Championed by Guy Lardieri, VP of strategic initiatives and business architecture, the center was created in April 2005 as a spin-off from a project to replace an aging system with enterprise applications from SAP and Siebel Systems.
The legacy system was hamstrung by defective data that drove up expenses and cut into revenue. In some cases, Avaya serviced customers’ telecom equipment but didn’t bill them for gear that was erroneously left out of service agreements. Other customers who paid only for standard service were getting premium service because of database errors, says Rich Trapp, Avaya’s global data quality director.
Avaya’s Data Quality Center of Excellence provides the tools for improving and maintaining data quality, including Business Objects’ IQ Insight profiling tool for identifying bad data. The Data Quality Executive Council, made up of top company executives, "provides the teeth for the data quality efforts," Trapp says. But it’s the business units that have ultimate responsibility for data quality.
Workgroup BI Poised for a Comeback
3/25/2009 | By Stephen Swoyer
Remember those rogue business intelligence (BI) and data warehousing (DW) deployments that first came to the fore in the 1990s? They’re coming back. Workgroup BI didn’t die, of course — it was just dormant.
With the advent of several explicit end-user-oriented offerings — including the much-anticipated Project Gemini from Microsoft Corp. — workgroup BI is poised for a big comeback.
Researcher Gartner Inc., for example, recently talked up its return, citing the efforts of upstart vendors such as QlikTech, Pentaho, and others to fill the void created by the withdrawal of established BI powers, which — over the last half-decade — have shifted to "focus most of their resources on delivering enterprisewide BI solutions," according to Gartner.
One upshot of this, wrote Gartner analysts James Richardson, Kurt Schlegel, Rita Sallam, and Bill Hostman in Gartner’s January BI Magic Quadrant report, is that a number of vendors, among them QlikTech, Lyza Inc., and (once Project Gemini ships next year) Microsoft, deliver "solutions for personal and workgroup BI requirements using disruptive technology" such as in-memory analytics. If anything, the Gartner quartet speculates, workgroup BI could prove to be an even more attractive play over the coming year, particularly if, as seems likely, the economic outlook doesn’t improve.
"It seems likely that despite the inherent risk of silo-perpetuation, workgroup BI’s light-touch nature will prove attractive in a recession," Richardson, Schlegel, Sallam, and Hostman write.
Not surprisingly, advocates of workgroup- or end-user-oriented BI offerings agree. Far from being rogue or anomalous product entries, they contend, their offerings address the otherwise unaddressed requirements of real-world users.
On its surface, workgroup BI may look like an inelegant solution — a Band-Aid, even — to a very messy problem. In contrast to the centrally-managed and centrally-deployed enterprise BI suite, workgroup BI touts a seemingly anarchic topology: it brings data processing — and in many cases, the data itself — back to the end-user desktop.
Workgroup BI offerings — chiefly those marketed by Lyza (developer of LyzaSoft), QlikTech (developer of QlikView), and Microsoft (which is still developing Project Gemini) — usually include spreadsheet and basic visualization capabilities. They promise to make it easier for users to get at the data they want, run the analyses they want, and — this is key — perform seemingly unlimited what-if analysis. (Workgroup BI offerings tend to be sold with a caveat: namely, that IT must first perform the requisite background data integration. So they’re not a completely out-of-band proposition.)
Advocates say that workgroup BI "empowers" users to do (to fish, so to speak) for themselves. Critics contend that it will lead to Information Babel, with multiple versions of the truth; they also claim that it is rife with security concerns and that it poses a host of manageability problems.
Both sides are probably right: few dispute that the classic workgroup BI tool — a Microsoft Excel spreadsheet — has also given rise to that most regrettable of practices — spreadmarting.
However, the workgroup BI offerings touted by QlikTech, Lyza, and (in the upcoming Project Gemini) Microsoft aren’t classic spreadsheet tools, proponents maintain. They’re of the same class that Robert Kugel, an analyst with BI and performance management (PM) consultancy Ventana Research, aptly describes as "better-than-spreadsheet spreadsheets." That is, they are tools that marry data centralization and data reconciliation features, governance, and — yes — manageability with a spreadsheet’s famous user-self-serviceability.
The new crop of workgroup BI offerings, unlike the better-than-spreadsheet spreadsheets of old, do away with still another of IT’s most strenuous objections: call it resource competition. One reason that IT folks try to limit both access to data sources and the kinds of queries that users can run is that they’re concerned about the potential for havoc. A single "bad" query can bring down a database, after all. Enter workgroup BI, which takes the database out of the loop by bringing both the data and the data processing back to the end-user desktop. In the workgroup BI model, proponents maintain, users can run any kind of query — no matter how involved (or how inelegantly structured, for that matter) — against their local data store. In both Lyza’s and Microsoft’s cases, that local repository is an in-memory column-based store, which makes it ideal (Lyza and Microsoft officials claim) for analytic number crunching. QlikTech, likewise, does all of its processing in-memory, which representatives say also accelerates analytic performance.
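To make the column-store idea concrete, here is a minimal Python sketch of why a column-oriented, in-memory layout suits analytic queries. It is an illustration only and does not reflect how Lyza, Microsoft, or QlikTech implement their products; the function names (`to_columns`, `sum_where`) and the sample fields are hypothetical, and it assumes every row carries the same fields.

```python
def to_columns(rows):
    """Pivot a list of row dicts into a dict of column lists (one list per field)."""
    columns = {}
    for row in rows:
        for name, value in row.items():
            columns.setdefault(name, []).append(value)
    return columns


def sum_where(columns, value_col, filter_col, filter_value):
    """Sum value_col over the positions where filter_col equals filter_value.

    Only the two columns involved are scanned; every other field is untouched,
    which is the main appeal of a columnar layout for aggregate queries.
    """
    return sum(
        value
        for value, key in zip(columns[value_col], columns[filter_col])
        if key == filter_value
    )
```

Given rows with `region` and `revenue` fields, `sum_where(to_columns(rows), "revenue", "region", "East")` totals revenue for one region by reading two lists in memory, with no round trip to a shared database.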
It’s a pitch to which business users are particularly receptive, proponents maintain. "We’ve done all of this work with BI across the whole industry. We’ve built all of these technologies. We know they’re effective, we know they’re reliable — yet we’re still only getting 20 percent penetration across businesses," says Donald Farmer, principal program manager for SQL Server Analysis Services with Microsoft.
The shift to enterprise BI — or (in Farmer’s parlance) to an "enterprise hierarchy" — hasn’t helped, he contends. "This whole move to the enterprise hierarchy [is] driven primarily by IT, not by the business. IT can become inadvertently a bottleneck that constrains [a business’s] resources. Business users don’t get necessarily what they want."
On the other hand, Farmer points out, many business users enjoy working in Excel. They like it precisely because it lets them slice, dice, and manipulate data — to "actually change numbers," as he puts it — in situations where they’re frequently constrained by the limitations of their BI tool or, just as likely, by a lack of responsiveness on the part of in-house IT.
"The number-one feature of every BI tool? It’s always Excel integration. Why is that? Because Excel gives you the power, it gives you the flexibility. So how do you get that power, that flexibility, and yet at the same time get that responsibility, get that compliance?" Farmer’s question is rhetorical: as far as he’s concerned, Project Gemini will do just that.
Other players stake out similar ground. Take Scott Davis, CEO of BI start-up Lyza. In its Lyzasoft product, Lyza currently delivers what Microsoft has at this point promised to deliver — next year — with Project Gemini: an end-user-oriented analytic workbench, complete with a desktop-based columnar data store.
Davis, for his part, claims that Lyzasoft and Redmond each separately cooked up their respective schemes. He spins it as a case of parallel evolution that, especially when considered alongside rival QlikTech’s similar strategy, demonstrates a pressing market need. "If Microsoft hadn’t been screaming exactly the same message that we’re screaming, we’d probably find it rougher sledding. The fact that their research lines up almost exactly with our own research conclusions, the fact that what they’re delivering is parallel to what we’re delivering, that’s a validation of what we’re doing," he contends.
Davis says Lyzasoft’s vision goes further than Microsoft’s; he talks up what he describes as an "enterprise-grade social computing infrastructure" — delivered via Lyza — which his company plans to deliver (on an iterative basis) in forthcoming releases. Borrowing from what works in social networking to make BI tools more usable might sound like heresy, Davis concedes, but it’s of a piece, he contends, with what’s driving Workgroup BI.
"There’s a fundamental distinction between tools [that are] built for engineers and tools that are built for non-engineers. The former [class of tool] rightly assume[s] that engineers will not start work until they know what they’re going to do. But non-engineers don’t work that way, and that’s the fundamental problem with most of the [existing BI tools] out there," he claims. "Engineers want to do things the ideal way. Users just want to do things."
Workgroup BI has its detractors, of course. Among other concerns, they raise valid objections about its scope: as a Band-Aid-like approach, one critic contends, Workgroup BI emphasizes short-term gains at the expense of a much bigger long-term return on investment.
There’s also a belief that the Workgroup BI model amounts to a repudiation, in practice if not in principle, of much of what BI and data management professionals have struggled to achieve over the last 15 years. "Sure you can take some data and put together something in a couple of days. As a former consultant for a large global firm, we would do this all the time with many BI tools to demonstrate capability. The reality is that without a data warehouse behind it, [a Workgroup BI tool] quickly becomes another disconnected, stove-piped application," says a BI Architect and QlikView user who spoke on condition of anonymity.
In the cases of both QlikTech and Lyza, data integration (DI) — which this architect says is always the most time-consuming aspect of any enterprise BI effort — is primitive, at best, taking the form of generic ODBC or JDBC connectivity, along with hand coding. (Project Gemini, on the other hand, boasts canned DI, thanks to Microsoft’s SQL Server-based BI stack. Microsoft still hasn’t articulated a coherent metadata strategy that spans both its SQL Server and Office BI toolset.)
For the record, this BI architect thinks QlikView a fine dashboarding tool. He bristles, however, at the notion that a Workgroup BI tool should — or could — supplant a well-managed BI and DW infrastructure. "While this might be good for a department or very small company, it is disastrous for a medium-size to large one. It becomes one more stove-piped application to support, like a spreadsheet on steroids. In the case where you already have a data warehouse, then [a workgroup BI tool] becomes just another front end. From this perspective, there is no real time saver versus any other front end."
Wayne Eckerson, director of TDWI research, shares the architect’s sentiments but says workgroup-oriented offerings tend to exploit user pain points, offering best-of-breed functionality at low cost. “Departmental purchasing ... is alive and well,” said Eckerson, in an interview earlier this year. “Best of breed tends to dominate, since vendors who play here are specialized, new, hungry, or all three. [It provides] good value for the price with advanced technology [that’s] free from the architectural burdens of the bigger players.”
On the other hand, Eckerson continues, all-in-one BI — with the purported cost efficiencies of standardization — is still attractive to many customers. "[E]ven before the downturn, large companies were looking to cut costs by reducing the number of suppliers, systems, and disparate applications to support and to conform with compliance mandates regarding data and financial reporting," he said. "The leaders make it easy to deliver all BI from a single vendor. What could be more standard than that? Plus, these guys tout end-to-end integration of operations and analytics and everything in between."
Nevertheless, Eckerson continues, there will always be a place for best-of-breed products. This is precisely the need that most so-called workgroup BI offerings are tapping into. The mistake, if any, is to view workgroup BI-oriented offerings as in toto replacements for a well-managed BI and DW infrastructure.
"Most organizations recognize that you need to provide the right tool for the job and that standardizing on a single vendor for everything may saddle you with less-than-optimal toolsets," he concludes. "So many companies standardize on different toolsets for different tasks."
Stephen Swoyer is a technology writer based in Athens, Ga.
Driving Sales
Saab revs up its customer satisfaction efforts to speed its revenue growth.
Saab car owners have a reputation for being some of the most loyal in the industry. Why, then, would executives at Saab Cars USA feel the need to create a 360-degree view of customers and prospects? Two words: sales and service.
Saab Cars USA, a wholly owned subsidiary of Saab Automobile AB (owned by General Motors), imports and distributes Saab automobiles. Its approximately 220 U.S. dealers sold about 38,000 new cars in 2002; the goal is to sell more than 45,000 cars in 2003. “Our CRM initiatives will play a big role in helping us get there,” says Robert Henry, manager, eCommerce and CRM solutions of Saab Cars USA, in Norcross, GA.
Saab Cars USA rolled out its enterprisewide CRM solution and strategy, dubbed TouchPoint, beginning in January 2002. Saab is using TouchPoint to improve customer service efforts, as well as to support customers and dealers. The initiative focuses on the customer interaction center, marketing, lead management, and data quality.
Prior to TouchPoint, Saab Cars USA had about five systems in place, but they weren’t integrated. “One thing we wanted to do was have a consolidated view of existing customers. We didn’t have a system or solution in place that tracked our customers,” Henry says. “We had to go outside to purchase our own customer names.”
Creating a 360-degree view of its customers would allow Saab to use more sophisticated, multistage marketing campaigns, to improve the efficiency and functionality of the call center, and to share data across the organization. “The homegrown system we had worked fairly well for what we needed it to do, but wasn’t enterprisewide,” Henry says. “Data stayed in there and couldn’t be used in other parts of the organization.”
First Gear: Buy-In
Many CRM project leaders struggle to convince upper management of the value of CRM, but Henry and then-Director of CRM Dan David (now vice president of parts and service; currently Patrik Riese is director of CRM) had no problem convincing top brass to buy in. The trick was convincing them to go with Siebel to serve a company that has a staff of only 143. “Executive buy-in is critical, especially for solutions of this magnitude. Siebel isn’t cheap and integration costs aren’t either,” Henry says. “It’s a big commitment to do enterprise CRM — and a big expense.”
Henry and David put together a comprehensive business case and delivered presentations to then-President Dan Chasins (Debra Kelly-Ennis, from GM’s Oldsmobile division, is now president), CFO Ken Adams, and Vice President of Marketing Hans Krondahl. Fortunately, parent company GM had standardized on Siebel 6.3, “which helped our argument,” Henry says. “Plus, the implementation was part of a global CRM initiative.”
One of the key selling points was the marketing initiative, for two reasons: 1) Saab had no way to track leads faxed to its dealers; 2) it had to buy its own customer names from Polk to conduct marketing campaigns. Using Siebel to define and run direct marketing activities in-house would save money, even though Saab would still purchase some data from Polk. “Marketing is where significant ROI comes in,” Henry says.
Once the executives were on board, it was time to get the users (contact center agents and dealers) to buy in, too. Saab used both software and instructor-led training for its agents (who are actually employed by EDS), and over a two-month period before the launch sent out about five newsletters that described the new solution, discussed the differences between it and the previous system, and explained the more important role agents would be playing by being at the core of the customer database. “The system was significantly more robust than what they had been using. You have to know what you’re doing or you can get lost,” Henry says. “But right from the beginning everything went surprisingly smooth.”
Call center manager Dick Rommich agrees. “The agents have adapted extremely well to the Siebel system,” he says. “And we have several people who have gone above and beyond to identify issues and offer solutions to various challenges in the system.”
The dealers, however, were a different story. Saab piloted the system with 40 dealers from January through July 2002 but held off on a full rollout: instructor-led training wasn’t necessary, and the e-learning course the CRM project team put together for the dealers wasn’t ready until July. By October, 90 percent of the dealers had completed the training and had started receiving leads through TouchPoint.
Although the training went smoothly, the uptake wasn’t immediate. “When we went live in July we did a campaign to win a PDA: If you signed up, in August we would distribute leads; the first ten to use the system would win a PDA. We got about eighty-five dealers,” Henry says.
So the CRM team ran another campaign. The company had received about 40,000 leads for the new 9-3 sports sedan. “We said, ‘This is how we’ll be distributing the new leads,’” Henry says. Saab also offered a $50 American Express gift card to any dealer who completed the training and then received a password to the system. Those who did were entered into a raffle for two round-trip plane tickets. All but 20 small dealers signed on.
Second Gear: Ramping Up the Contact Center
The cornerstone of TouchPoint is Saab Cars USA’s customer interaction center (CIC). The first phase of TouchPoint began in January 2002 with the CIC, customer service, lead management, and dealer components of the solution. Saab implemented Siebel eAutomotive 6.3 in its CIC, and gave each of its dealers Siebel eDealer for lead management.
“The central application for dealing with customers is the call center application. That is the most up-to-date customer data. So it was the best place to start,” Henry says. “For marketing to work properly you need data quality. That’s a huge factor.”
Saab previously had a customer assistance center and an outsourced lead generation center. “With Siebel we were able to bring this in-house in the CIC. So agents are cross-trained and are part of Saab,” Henry says. “The lead generation partners were answering phones for thirty or forty other vendors, so they didn’t have much brand understanding. Now that it’s in-house, people are more excited about the brand.” Saab Cars USA has about 45 employees in the CIC: five agents for lead management, about 30 for service—the rest are managers.
Not surprisingly, customer satisfaction is important to Saab. And TouchPoint is already generating results in that area. “We’ve seen customer satisfaction ratings going up already, from 69 percent to 75 percent,” Henry says. “Siebel is only one part of why that’s gone up. We have some excellent employees and managers.”
Call monitoring software from Witness Systems that Saab began using in January “has had as much influence on the success of our center as Siebel has,” Henry says. “Agents love it. They were nervous at first, but are getting used to it. Before, managers would use a tape recorder to listen in on calls. It’s a significant addition to our center.”
Although Witness itself was “pretty much a turnkey solution,” Saab had to make adjustments to its Avaya switch at a significant cost, in the tens of thousands, Henry says. The cost was worth the return. Managers can now monitor such things as voice and movement around the system. “Through listening and watching, managers could tell how well agents are using the Siebel system,” he says. “It’s been tremendous.”
Saab also wants to improve communication with its customers, so it uses Siebel’s email management system. Email comes into Siebel and creates a contact record, and agents can respond through Siebel. Saab previously used KANA, which Henry says was excellent. “Our main reason for moving to Siebel was to have all customer contacts in one location,” he says. “It was just one part of collecting all this information.”
As far as communicating with dealers, any lead generation activities go through Siebel. Even if dealers are not on eDealer, they will receive an email about any lead. All leads go to the fulfillment center, and Saab tracks and uses that information for marketing.
Once Saab sends the leads electronically, the goal is to have the dealers update the system with follow-up information. “We’d like to know when they make the initial contact and if a test drive was taken, because if they test drive they’re more likely to buy—then the final disposition: Did they buy the car? If not, why not?” Henry says.
The response from dealers so far is not as high as the marketing team would like it to be, so Saab is getting its field sales force more involved. “The dealers are not updating the system enough, so we’re working diligently on getting them to.” TouchPoint generates bimonthly reports that list dealers, leads, and follow-up rates. District managers then use that information to discuss the status of leads with dealers in their territories.
Part of the problem was a miscommunication about the types of leads dealers were receiving. “Back in October we were sending out leads, but they were not all hot leads,” Henry says. “We communicated this to the dealers, but not well enough. We hoped the dealers would treat [these somewhat interested customers] differently. But the dealers approached them in more of a hard sell approach, which pissed off customers. We’ve worked since then to improve the quality of the leads. It adds to the costs, but by leveraging the lead management tool we call people to verify their interest, then pass on to dealers only leads that would be valuable in their eyes.”
According to Henry, the ability to qualify leads has been another advantage of pulling the lead management team in-house. The team also has the time to call customers and dealers to follow up on leads. “Maximizing their time is a cost savings for us,” Henry says.
Third Gear: It’s All About The Data
In May 2002 Saab Cars USA implemented Firstlogic ACE data quality software. “For our first phase back in January, we implemented the Siebel connector for Firstlogic. It checks, validates, and standardizes data as it comes in, in real time.” This is important, Henry says, because as Saab increases the number of lead sources and data integration points, it needs to eliminate duplicates at the point of entry.
Cost was a significant factor in choosing Firstlogic, Henry says. “For our money it’s doing what we need it to do. We can go in and weigh the different values on different fields, for example, how important is last name, zip code, etc.? Depending on the weight of each is the result,” he says. “The ability to match data is very complex. We have a good solution in place.” Using Firstlogic Saab was able to reduce its database size by 50,000 records. The company currently has 300,000 customer records in its database, with another 500,000 prospect names contained in the system.
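The weighted field matching Henry describes can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not Firstlogic’s algorithm: the field names, the weights, the 0.85 threshold, and the helper names (`field_similarity`, `match_score`, `is_duplicate`) are all hypothetical, and string similarity comes from the standard library’s `difflib.SequenceMatcher`.

```python
from difflib import SequenceMatcher

# Hypothetical per-field weights: how much each field counts toward a match.
FIELD_WEIGHTS = {"last_name": 0.5, "zip_code": 0.3, "first_name": 0.2}


def field_similarity(a, b):
    """Similarity in [0, 1] between two values, ignoring case and outer whitespace."""
    return SequenceMatcher(None, a.strip().lower(), b.strip().lower()).ratio()


def match_score(rec_a, rec_b, weights=FIELD_WEIGHTS):
    """Weighted average of per-field similarities between two records."""
    total_weight = sum(weights.values())
    weighted = sum(
        weight * field_similarity(rec_a.get(field, ""), rec_b.get(field, ""))
        for field, weight in weights.items()
    )
    return weighted / total_weight


def is_duplicate(rec_a, rec_b, threshold=0.85):
    """Flag two records as likely duplicates when their score reaches the threshold."""
    return match_score(rec_a, rec_b) >= threshold
```

Raising the weight on a field makes disagreement in that field count more heavily against a match, which is the tuning knob the quote refers to. Where to set the threshold is a judgment call that depends on how costly a false merge is compared with a missed duplicate.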
The next phase was implementing Siebel Marketing. Saab’s marketing team uses the consolidated data from Firstlogic in conjunction with the Siebel campaign management software to automate and run highly targeted, multistage marketing campaigns.
One basic program that Saab has put in place is an outbound telemarketing campaign to new owners to verify their contact information and ask about their satisfaction with their car. Agents update the customer’s record with information from the call.
Saab now also has a long-term lease loyalty campaign in place. Depending on such selection criteria as whether customers are 12, nine, six, or three months away from the end of their lease, the system creates a mail file that sends information to fulfillment, which sends the appropriate materials to each customer. The system also creates a campaign record so agents can see which customers were contacted, what the contact was, and whether the customer responded (e.g., redeemed a certificate or called in).
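The selection step of such a campaign might look like the Python sketch below. It is an assumption-laden illustration, not Saab’s or Siebel’s implementation: the record fields (`customer_id`, `lease_end`), the helper names, and the choice to count whole calendar months while ignoring the day of the month are all hypothetical.

```python
from datetime import date

# Milestones named in the article: months remaining before the lease ends.
MILESTONES = (12, 9, 6, 3)


def months_until(lease_end, today):
    """Whole calendar months from today to lease_end (day of month is ignored)."""
    return (lease_end.year - today.year) * 12 + (lease_end.month - today.month)


def build_mail_file(customers, today):
    """Return one row per customer whose lease ends exactly a milestone number of months out."""
    rows = []
    for customer in customers:
        remaining = months_until(customer["lease_end"], today)
        if remaining in MILESTONES:
            rows.append(
                {"customer_id": customer["customer_id"], "months_remaining": remaining}
            )
    return rows
```

Run once a month, a job like this would hand fulfillment a fresh list at each milestone. The same rows could also seed the campaign record that agents consult when a customer calls in.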
Cruising Speed
It’s too early in the initiative to give specific ROI numbers, Henry says. But Saab Cars USA has seen its share of positive results so far.
“The biggest impact has been to consolidate all our customer and prospect data into one place where it is accessible to multiple parts of the organization, and all the benefits that brings,” says Director of CRM Patrik Riese.
It’s the first time in Saab’s history that it has had a consolidated database. According to Henry, there are still some data quality issues, but without an employee dedicated to data quality, this is not unexpected. “I don’t know if you ever have clean data, but it’s more pronounced now, because we have the marketing tools in place,” he says.
Saab expects to see cost savings, especially in marketing. “We will use our data to do better and more effective targeted marketing to maintain and improve our customer relationships,” Riese says.
“Learning from campaigns to see if what you’re doing is adding value or just adding cost will be a great benefit,” Henry adds. “That’s where you see the improved efficiency and cost savings.”
Sending leads electronically is one significant improvement. “Once we receive feedback consistently, that will help us make the proper marketing decisions,” Henry says. “Dealers are on the frontlines. If they give us feedback we could refine campaigns to bring in better-qualified and meaningful prospects.”
The CIC is benefiting, too. In time Saab will be able to reduce head count with the efficiency of the Siebel system. Although Saab doesn’t have specific efficiency improvement numbers, as a result of using Witness “soft ROI is significant,” Henry says. Even the realization of being recorded has affected the behavior of the agents.
Customer satisfaction ratings have increased from 69 percent to 75 percent, and Saab expects that this will translate into increased sales. “That our customer satisfaction rating is high speaks well of our [CIC] managers, that they train and manage agents well.”
And it speaks well of the entire CRM project team that Saab hit its deadlines and budget. “We did factor in that through improved loyalty we would increase sales,” Henry says. “We’ll see those results after we see the learning from marketing. That will come in time as we improve.”
Saab Cars USA’s CRM initiative came in on time and under budget. And that was just the beginning. Saab has also:
• Created a 360-degree view of customers, because all customer data is in one, centralized location
• Increased customer satisfaction ratings from 69 percent to 75 percent
• Deleted 50,000 duplicate customer records
• Reduced costs significantly, because it no longer has to buy its own customer names
• Upgraded from faxing leads to sending them via email
• Improved in-house CIC agents’ efficiency and enthusiasm; they are more excited about the brand than outsourced agents were.
Hazardous Data
Allowing dirty data to populate your database can contaminate your business processes. You may not need a haz-mat team to clean up the mess, but until you scrub it down you’re never going to realize its full potential.
Robert Regis Hyle
No business wants to look foolish in the eyes of its customers, yet for some companies it happens every day. Customers receive multiple copies of an insurer’s privacy statement or marketing material touting a new annuity product. Instead of feeling secure because they know their private information is safe, or excited about a new investment opportunity, customers are left laughing and shaking their heads. Claudia Imhoff, president of Intelligent Solutions, speaks with insurers all the time about this problem and sometimes feels more like a counselor than a consultant. “It’s kind of like being in an AA program,” she says. “You first have to recognize you have a problem.”
Large corporations constantly are dealing with the age-old problem of internal communication. “Insurers have silos of processes and silos of workflow and don’t realize the data they create in these processes and workflows actually is used in other processes and workflows,” says Imhoff. “Not in the way they assume it will be used.”
James Fridenberg, vice president of applications development with Farmers Insurance Group, says there is one thing insurers need to remember: The data has to come first. When that happens, many problems are solved, and even more important, many opportunities are opened for carriers.
An Easy Sell
It’s an easy story to sell to upper management, Fridenberg believes, especially when you can point to specific problems and potential problems that will arise from not having data-quality initiatives, checkpoints, and touchpoints. “When you are dealing with 15 million customers and consider the volume of change and the number of transactions we do a day, the impact [of poor data quality] would be pretty severe,” he says. Implementing data-quality initiatives can be costly, but the potential for savings is tremendous. “It’s not a hard sell when you can tell a story of what the potential could be by not having stringent data-control initiatives in place,” says Fridenberg. “I believe you could draw a good business case for ROI by putting in data-quality initiatives. The potential for creating impact is substantial when you’re dealing with large systems such as ours, so the ROI is there through cost avoidance.”
Farmers has worked hard on improving the quality of its ad hoc marketing data for over three years, according to Fridenberg. And the company has found poor quality has an impact on customers, agents, and service centers. “This is a serious business, and we take our applications and our data very seriously,” he says. “The importance of data quality is ingrained in us.”
Multi-line insurer Conseco’s decision to bring its data cleansing in-house was financial, asserts Tom Besancon. As assistant vice president of marketing information and technology, Besancon says, “We turned to it to save us some money: reduce costs in terms of standardizing and cleaning up some of our [customer] names for mailings.”
The suite of tools Conseco purchased from software provider Firstlogic not only helped to clean up the company’s data, it offered the opportunity to enhance the insurer’s data as well, particularly some of the prospect lists Conseco purchased for marketing its products. “We were looking at the tools to improve some of our modeling efforts by using matching and consolidation to get a single view of the customer,” he says. “We are doing a lot of ad hoc merge purges, purging privacy mailings and previous campaigns to reduce costs and stay in compliance with the privacy stipulations and laws.”
Fill in the Blanks
Cleaning up data is important, Imhoff believes, but improving data can be done easily if everyone dealing with the data understands the needs of other departments within the company. “One of the problems I see all the time is the claims team will fill out forms so it can do its job, which is to process claims,” she says. “There are lots of fields in a claims form, though, which have nothing to do with paying for the claim. Some of those fields are useful down the road in analyzing the claim (how did it happen, what were the reasons, what were the dates of the claim) and can be very useful in showing patterns of fraud.”
“The problem is those fields are meaningless to the people trying to close the file. They simply say, ‘Is this a valid claim?’ If it is, they pay them,” says Imhoff.
Administrative people entering data don’t realize what they are entering into the system is being used elsewhere in the company, even if no one in that particular department is using it. “Sometimes it’s just recognizing the problem,” she says. “Once it’s recognized you can start to say, ‘What can we do to fix the problem?’”
The ROI
At that point, Imhoff suggests, a bigger problem comes into play: ROI. “How do I change someone’s business processes without affecting the bottom line?” she asks. “If I make claims clerks look up codes or verify dates, I slow them down. If they are like most order-entry clerks, they are paid by the number of claims they enter in a day. Part of the quality problem is looking at the holistic view and saying, ‘Is it worth the time it’s going to take and the money we’re going to lose because we’re not getting in claims as fast as we used to?’”
Most companies have the correct information about a customer; they just have too much information about the same person, and that leads to confusion. Imhoff says data-cleansing tools really shine in this environment. “If it is the customer information you’re worried about and you have multiple instances of the same customer, then the tools can handle that easily,” she says. Using her own name as an example, she says Claudia Imhoff can appear on data as C. Imhoff, C.M. Imhoff, or Dr. Claudia Imhoff. “All these are different versions of me,” she says. “Data-cleansing tools can clean all that up and consolidate it into a single record, which is what you want.”
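To make the consolidation idea concrete, here is a minimal Python sketch that groups the name variants Imhoff mentions under one key. The matching rule (surname plus first initial, with honorifics ignored), the TITLES list, and the function names are illustrative assumptions, not the method any commercial data-cleansing tool actually uses; real products apply far richer matching logic.

```python
from __future__ import annotations

import re
from collections import defaultdict

# Honorifics to ignore when comparing names (illustrative, incomplete list).
TITLES = {"dr", "mr", "mrs", "ms", "prof"}


def name_key(raw: str) -> tuple[str, str]:
    """Reduce a name to (surname, first initial) for crude matching."""
    # Split on whitespace, periods, and commas; drop empty tokens.
    tokens = [t for t in re.split(r"[\s.,]+", raw.lower()) if t]
    tokens = [t for t in tokens if t not in TITLES]
    if not tokens:
        return ("", "")
    surname = tokens[-1]
    first_initial = tokens[0][0] if len(tokens) > 1 else ""
    return (surname, first_initial)


def consolidate(records: list[str]) -> dict[tuple[str, str], list[str]]:
    """Group raw name strings that share the same matching key."""
    groups: dict[tuple[str, str], list[str]] = defaultdict(list)
    for rec in records:
        groups[name_key(rec)].append(rec)
    return dict(groups)


variants = ["C. Imhoff", "C.M. Imhoff", "Dr. Claudia Imhoff"]
groups = consolidate(variants)
print(groups)
```

All three variants land in a single group here. A rule this loose would also merge, say, a hypothetical Carl Imhoff into the same record, which is one reason production tools add address, date-of-birth, and fuzzy-matching checks before consolidating.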
The (Almost) Perfect Data
There is no such thing as perfect data, according to Imhoff, but companies can get close, and the closer they get, the less money it’s going to cost them in the long run. “You try to get the data to the point where it is as good as it possibly can be,” she says. “There are always subversive things that will cause the data to be maybe 99 percent perfect, instead of 100 percent, but that’s better than what most organizations are dealing with today.”
There is more than just customer information to contend with, however. Insurers have product and claims information, and health insurers have lists of providers. “Are they in there multiple times?” says Imhoff. “You don’t want to go in and fix the whole thing at once. You want to go in piece by piece and slowly work through the enterprise data.”
That means establishing priorities, though. “What most insurance companies do that I’m familiar with is to prioritize the data and decide which pieces are most critical right now,” says Imhoff.
Fridenberg describes it as peeling back the layers of an onion. The decision has to be made to start with the areas insurers feel will have the most immediate impact on their customers, the agents, and the service centers. “Those things are typically anything that has to do with balancing, in the sense of accounting or GL,” he says. “Commissions are always top of mind, or anything fee related or premium related. Those are things that are touchpoints, critical path items.”
Cynthia Saccocia, senior analyst in the insurance practice for the research consultant TowerGroup, believes the uniqueness of insurance contributes to the quality of data it can collect on its customers. She believes that is one reason a number of companies are pushing to have their independent agents licensed in both the P&C field and with financial products.
“Insurers have typically maintained a silo organization,” she says. “They are dealing with a long legacy of doing business a particular way. Now, convergence in the marketplace is pushing them into a space they haven’t been accustomed to working in, and they are yet to get really comfortable.”
She believes data collected by insurers is superior to what financial services companies can get on their clients. “Typically, a customer goes into a bank for episodic-type advice,” says Saccocia. “Something that is very oriented to point in time. Insurers have a wealth of data they can support in cross-selling opportunities and be very targeted.”
I’ve Got (Algo)rithms
The software tools are incredibly sophisticated today, Imhoff points out. A program is made up of hundreds or thousands of algorithms that comb the data. “Once you set [the program] free, it is off and running and can do whatever you want it to do,” she says. “It’s just a matter of getting to that point.” Imhoff advises insurers not to try to build their own tools, though. “You would have to write those hundreds if not thousands of algorithms,” she says. “You don’t want to do that. You might as well go into the [data-cleansing] business if you’re going to do that.”
Insurers should consider one example, she explains: “What if you wanted to send a mailer and you realized 20 percent of [the names and addresses] were dupes?” she asks. “How much money would you save if you didn’t spend the money [for the duplicates] on the postage and so forth? I imagine [the savings] would more than pay for the tools, especially if you have millions of insurance policies. Now you’re looking at a customer instead of five.”
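Imhoff's mailer example reduces to simple arithmetic, sketched below in Python. Only the 20 percent duplicate rate comes from her remarks; the mailing size and the per-piece cost are hypothetical figures chosen purely for illustration, and the function name is an assumption.

```python
def duplicate_mailing_savings(pieces: int, duplicate_rate: float,
                              cost_per_piece: float) -> float:
    """Estimate the money not spent when duplicate pieces are removed."""
    if not 0.0 <= duplicate_rate <= 1.0:
        raise ValueError("duplicate_rate must be between 0 and 1")
    # Number of pieces that would have gone to an already-mailed customer.
    duplicates = round(pieces * duplicate_rate)
    return duplicates * cost_per_piece


# Hypothetical campaign: 1,000,000 pieces at an assumed $0.50 each
# (printing plus postage), with the 20 percent duplicate rate from the text.
savings = duplicate_mailing_savings(1_000_000, 0.20, 0.50)
print(f"Estimated savings per campaign: ${savings:,.2f}")
```

Under those assumed inputs a single campaign avoids 200,000 wasted pieces. Whether that covers the price of a tool depends on license cost and mailing frequency, neither of which the article specifies.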
Finding the Right Tools
Insurers shouldn’t complicate things from the outset, Imhoff warns. “I would look for a tool, first of all, that is easy to use, easy to set up, easy to understand, and easy to maintain,” she says. Don’t expect all data-cleansing tools to be simple, though. “Some of them can be difficult,” she adds. “They are very esoteric in some respects, so I look for ease of use.”
The second step in the selection process is determining the level of sophistication the insurer requires.
A third step is determining whether the system maps to the current technology the insurer has in place. “Can you use it on all your databases, or are you limited to mainframes?” she asks. “You need to look for a map in terms of the technology.”
The fourth step often is overlooked. “That’s support from the software company itself,” she says. “I don’t think most people think about that.” She believes the vendor has to be fully supportive of the project. “Will they help me, not just working the tool, but in analyzing the results from what the tool gives us?” she says. “What is this data telling me? What do I need to do? What needs to change?”
Besancon says Conseco had a specific need when it began its product search. “We were looking for something that could sit on a UNIX box and process loads of data quickly,” he says. When shopping around, insurers will find a wide array of products. “There are products out there that are PC based and larger products as well,” he says.
There also are plenty of outsourcing options, Besancon mentions. The Firstlogic product is a suite of tools, which will allow Conseco to add on over time. But although Conseco has purchased its own tool, it has kept some of its outsourcing options open. “We’re not putting all our eggs in one basket,” he says.
The historical problem of data for insurers deals with the expensive process of building a data warehouse, trying to cleanse the data, and defining it, according to Saccocia. “Insurers have had a lot of starts and stops with those types of activities,” she says. “Now they’re at a point where they have to step back and say, ‘Are we doing a good job in building this? What are our core objectives to achieve it?’ That is when they need to look for vendor support to move to the next level.”
She believes the typical approach to data solutions for insurers was to build a system because insurers felt they had unique needs. “In our conversations, we’re finding many vendors that may be horizontal in nature and are trying to become more vertical for the insurance industry,” says Saccocia. The vendors can provide specific tools for an individual industry. “The data is there,” she says. “It just needs to be better managed.” This will become more important as insurance leaders change their view of what an insurance company is. “They are becoming more brokerage oriented, and they view themselves as competitors in the financial services marketplace,” says Saccocia.
Worth the Effort
Conseco is saving a great deal of money today in printing costs and postage because it has eliminated many of the duplicate names and addresses from its database. But from a marketing standpoint, Besancon believes the ability to track the success of a mailer will be invaluable. “We go back to the tool for response analysis,” he says. “If we do a mailing of, say, 50,000 people, we actually can determine if they bought a product by checking the database.” He claims the company had ways of gauging its marketing success prior to purchasing the software, but the cost was just way out of hand.
Besancon says Conseco has surpassed all expectations of the project. “Our success rate has been better than imagined,” he says. “We didn’t realize how much money we could save by bringing this in-house.”
The Industry's Dirty Secret
By Therese Rutkowski, Managing Editor
October 1, 2003 – “Garbage in, garbage everywhere.” That’s a twist on the old adage, “garbage in, garbage out,” courtesy of Firstlogic Corp., a La Crosse, Wis.-based data quality software provider. “We say, ‘garbage in, garbage everywhere’ because so many systems share data that bad data in one spot can easily propagate across the entire organization,” says Chris Colbert, industry marketing director, at Firstlogic.
Bad data can also spread across organizations, as David Jokinen discovered when J.P. Morgan Chase & Co. identified him as deceased in its systems, instead of his mother, who passed away in April 2001.
For more than two years, as reported recently in The Wall Street Journal, Jokinen has been trying to correct J.P. Morgan’s error and convince mortgage brokers, credit card issuers, car dealers and insurers that he is very much alive and deserving of their products and services.
Indeed, the financial services industry is converging. And as it does, insurers, banks, brokerages and their business partners are exchanging information electronically in real-time to process transactions more quickly, to improve customer service and to reduce costs associated with manual workflow.
“We share data with agents, we share it with people who do claims processing, with banks that offer mortgages, and with risk managers,” says Mele Fuller, interface architect at Seattle-based Safeco Corp. “There are many organizations with whom we share our data, and it’s growing.” (See “The Industry Standard for Consistent Data,” page 24.)
The industry also is implementing customer relationship management (CRM), data warehousing and business intelligence solutions. The intention is to share data enterprisewide, and analyze it to make more informed decisions more quickly in response to market changes and competitive pressures.
Therein lies the conundrum: The financial services industry, which always has been built on data, is becoming even more dependent on data. And as it becomes more dependent on sharing data, often in real time, managing that data effectively becomes not only important but absolutely necessary.
“Detroit manufactures cars. You can go to the dealership. You can touch them. You can smell them. You can drive them. You can see the deliverable,” says George Jablonski, P&C enterprise data architect at The Hartford, Hartford, Conn.
“Insurance, on the other hand, sells promises,” he says. “The promises are documented in contracts. And contracts turn into data. And data is stored. Our asset is that data versus a car that you can see. If you understand this analogy, you understand the importance of information as an asset to an insurance company.”
A big problem
Yet, despite the fact that data management is the linchpin of insurers’ operations, a lot of dirty data lurks in their systems. Mr. Jokinen’s tale is one of many stories of a data error gone awry in the financial services world; most go untold.
“The problem is bigger than anyone fully realizes, or is willing to acknowledge,” says Ron Barker, insurance practice area leader at Chicago-based Knightsbridge Solutions LLC, a data management consulting firm.
In fact, The Data Warehousing Institute, Seattle, estimates that poor quality customer data costs U.S. businesses a staggering $611 billion per year. This figure doesn’t even include the cost of losing customer loyalty by incorrectly addressing letters or failing to recognize a customer who calls or visits a company’s Web site (see “The Cost of Dirty Data,” page 23).
“People are getting fed up with getting mail with their names scrambled,” says Jack Hermansen, CEO of Language Analysis Systems Inc., a Herndon, Va.-based multicultural name recognition software provider. “I received a letter that read, ‘Dear Mr. Inc.’ I’m sure everybody has stories like that. A lot of people just throw the mail in the garbage and say, ‘If this is how much this company cares about treating names, what luck will I have calling them and being treated like an individual?’”
Customers are getting fed up, and the government is putting pressure on companies to manage their data better. The Gramm-Leach-Bliley Act (GLBA) and the Health Insurance Portability and Accountability Act (HIPAA) both require insurers to protect the privacy of customer data. Similarly, the recently passed Sarbanes-Oxley law mandates that public companies report accurate financial data, with hefty fines and imprisonment as penalties.
Data quality is a hot topic again, says Tracy Spadola, senior industry consultant at Teradata, a division of NCR Corp., Dayton, Ohio. “I’ve been working in the field for 20 years. It had its heyday in the 1980s, and it dipped. But it’s coming around.” With so much more information being captured, shared and scrutinized, companies are asking, “How do we manage it?” she says.
“We’re hearing more and more about data quality and data management because it’s like a pressure cooker,” says William Sinn, vice president of insurance and healthcare marketing at Teradata. “People realize they can’t embark on a lot of business initiatives unless they’ve got good data quality.” (See “Cleaning Your Data-And Keeping It Clean,” page 38.)
Indeed, insurers are investing in initiatives such as business intelligence, data mining and analytical tools to help them correlate policy, claims, demographic, geographic, and other customer and operational data, and respond more quickly to market pressures.
360-degree view
Allstate Insurance Co., for example, is developing an enterprise CRM program which involves infrastructure modifications, an enterprise customer database, analytics, business rules software and change management.
The objective is to create a 360-degree profile of Allstate’s customers, and their households, to assist the Northbrook, Ill.-based company in cross-selling and retaining those policyholders across distribution channels, according to Kimberly Harris, research director, Gartner Inc., Stamford, Conn.
U.S. Risk Insurance Group, a Dallas-based managing general agency (MGA) that distributes excess and surplus lines, also is investing in analytical technology to understand and run its business better. “There’s an increasing need to articulate our business plans and to understand our book of business better,” says Monte Stringer, executive vice president and CIO of U.S. Risk Insurance Group.
“For an MGA to be successful in this current hard market, that MGA has to have an almost fanatical focus on underwriting,” he says.
To that end, U.S. Risk is implementing business intelligence technology from Thazar Inc., a Skywire Software company located in Frisco, Texas. Thazar’s software will enable U.S. Risk to determine what business they’re producing, where the business is coming from geographically, and from what producers. “The more we know about our business, the better we can perform in the marketplace,” Stringer says.
The insurance industry currently is focusing on underwriting results more than in the recent soft market cycle, but the infrastructure in most companies does not support the granularity and level of analysis companies need to truly understand the relationship between risk and costs, according to Tom Chesbrough, executive vice president and founder of Thazar.
Insurers need detailed data about demographics, driving records, vehicles, geography and premium and loss characteristics, he says.
They also need clean data. Data mapping and cleansing is by far the most challenging part of any data mastery project, according to Matthew Josefowicz, senior analyst, at Celent Communications Inc., a Boston-based research and advisory firm. This process typically consumes 80% of the implementation time and resources, and 40% of the overall project from planning to training and maintenance, he notes in a recent Celent report titled, “Insurance Data Mastery Strategies.”
A significant portion of U.S. Risk’s business intelligence implementation involves testing data quality, Stringer explains. Initially, U.S. Risk Group is creating manual reports, and calculating certain known variables. Then, the team is plugging the same data into the business intelligence system to ensure the data is clean and the results are accurate. “Bad data is worse than no data,” he says.
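The check Stringer describes, computing known values by hand and then confirming the new system reproduces them, can be sketched as a simple reconciliation in Python. The metric names and figures below are hypothetical placeholders, and the function is an illustrative assumption rather than anything U.S. Risk or Thazar is reported to use.

```python
import math


def reconcile(manual: dict, system: dict, rel_tol: float = 1e-9) -> list:
    """Return (metric, manual_value, system_value) for every disagreement.

    A metric missing from either side is also reported as a mismatch.
    """
    mismatches = []
    for metric in sorted(set(manual) | set(system)):
        m = manual.get(metric)
        s = system.get(metric)
        if m is None or s is None or not math.isclose(m, s, rel_tol=rel_tol):
            mismatches.append((metric, m, s))
    return mismatches


# Hypothetical figures: values calculated by hand vs. the BI system's output.
manual_report = {"written_premium": 1_250_000.0, "policy_count": 4_200, "loss_ratio": 0.62}
bi_output = {"written_premium": 1_250_000.0, "policy_count": 4_185, "loss_ratio": 0.62}

issues = reconcile(manual_report, bi_output)
for metric, expected, actual in issues:
    print(f"{metric}: manual={expected} system={actual}")
```

An empty result suggests the load and calculations agree with the manual baseline; any mismatch points to a mapping or cleansing problem to chase down before the system's reports are trusted.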
In fact, many insurers became aware of just how dirty their data is when they implemented CRM systems and data warehouses back in the 1990s, says Teradata’s Spadola.
“Once insurers began pulling all their data together, chances are, it was the first time they were seeing it all linked,” she says. “Instead of their marketing data here and their underwriting data there, it was all pulled together, and that’s when many companies realized they had some quality issues.”
It’s also why many executives are now reluctant to invest in data management solutions, according to Knightsbridge’s Barker. “A data warehouse alone can cost millions of dollars,” he says. “And there are enough data warehouse train wrecks and CRM train wrecks out there that CIOs are reluctant to pony up the money to support these efforts now.”
With credibility risks, compliance requirements, and competitive pressures mounting, however, insurance executives realize they can’t ignore data quality and data management much longer.
“This is a strategic issue,” says Teradata’s Spadola. “It’s all well and good to say, ‘We know we have data problems, and we need to fix them.’ But it really requires setting up a formal data stewardship role and putting policies and procedures in place that say, ‘We are going to treat our data as a resource, and we’re going to manage it effectively.’”
That’s precisely what’s happening at The Hartford, according to Jablonski. This year, the carrier established an enterprise data unit. And “information/data” is a category unto itself in the company’s information technology investment portfolio.
“Establishing this unit signifies that the business folks recognize the importance of data, and that it’s a good idea for the management of data to be centralized,” Jablonski says. “It will help us in the future to make sure we treat data consistently across the organization.”
Such initiatives have come and gone in the past, he says, but this time it’s different. “This is a very strong effort. The recognition is there that we want to treat information as an asset, and folks here are doing something about it.”
Still, it’s not uncommon for companies to view improving data quality as a one-time project. When they bring in a new system, they see that as an opportunity to clean up their data, Firstlogic’s Colbert says. “But data quality is an all-the-time thing. Data degrades over time. People move. People get married. Obviously, in the insurance business, people die. These changes have to be dealt with on a consistent basis.”
One tool Knightsbridge’s Barker promotes is a metadata repository. “Metadata is data about data,” he says. It describes: What is the data? Where did it come from? What transformations did it go through? What happened to it from the time it was pulled from the source system into the data warehouse? How did it change? “Metadata becomes the key element associated with data quality,” he says.
An information architecture approach to data management is also essential, according to Thazar’s Chesbrough. His company promotes a centralized data warehouse, rather than having many data marts, to ensure there is “one version of the truth.”
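The questions Barker lists (what the data is, where it came from, what transformations it went through) map naturally onto a small record structure. The Python sketch below is a hypothetical illustration of one repository entry; the class, field names, and sample values are assumptions, not the schema of any actual metadata product.

```python
from dataclasses import dataclass, field


@dataclass
class ElementMetadata:
    """One metadata-repository entry describing a single data element."""
    name: str            # What is the data?
    description: str
    source_system: str   # Where did it come from?
    transformations: list = field(default_factory=list)  # What happened to it?

    def record(self, step: str) -> None:
        """Append one transformation step, in the order it was applied."""
        self.transformations.append(step)

    def lineage(self) -> str:
        """Summarize the element's path from source system to warehouse."""
        steps = " -> ".join(self.transformations) or "(no transformations)"
        return f"{self.name} from {self.source_system}: {steps}"


# Hypothetical element and steps, for illustration only.
meta = ElementMetadata("policy_effective_date", "Date coverage begins", "legacy_policy_admin")
meta.record("parsed MMDDYY text to ISO date")
meta.record("loaded into warehouse")
print(meta.lineage())
```

Keeping the steps as an ordered list means anyone questioning a value in the warehouse can trace it back to its source and see each change made along the way.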
“Store once, use many,” is a mantra spoken by proponents of centralized data warehouses. “The idea is to start with the data in a single place and build from there,” Teradata’s Sinn says. “You can keep reusing the data, but why store it in 20 different systems when you can have it in one place and pull it from there?”
One bite at a time
It’s important to remember only 10% to 15% of an organization’s data is “enterprise” data, meaning data that is relevant across the organization, The Hartford’s Jablonski notes. “The other 85% lives in the business ‘siloes.’
“Siloes aren’t bad,” Jablonski says. “Many organizations have been set up with smaller units to be flexible and react to business changes. That’s just the nature of the beast.” With an enterprise view of data assets, siloes can still operate as they always have. “We want to provide an enterprise view of information without being disruptive to the business areas.”
At The Hartford, for instance, there are approximately 50,000 total data elements, and only 500 are likely to be “enterprise” data elements, he says. But the pitfall for many companies is “they try to bake the whole cake. They try to tackle mastering all their data in one huge initiative. That’s overwhelming. It’s staggering, and people fumble on it.”
Companies are wise to “think big, but start small” when implementing data quality solutions, sources say. “You’ve got to start someplace, so start at a place you think is the worst, or at least an area that you can clean up, and build out from there,” says Teradata’s Sinn. “It’s like the old adage: How do you eat an elephant? One bite at a time.”
A few years ago, companies built huge data warehouses from scratch, says Thazar’s Chesbrough. “That was very expensive. Now, we’re able to implement systems in components (certain lines of business or certain areas such as claims), in phases.” This way, an insurer can build confidence in the technology, and prove its worth with short-term benefits and return on investment, he says.
In addition, some relatively inexpensive methods of improving data quality can produce ROI quickly. For example, using an address verification tool can cut costs associated with duplicate mailings almost immediately.
“You can narrow down thousands of data records by simply verifying that an address is valid,” says Tho Nguyen, program director in data management strategy for SAS Institute Inc., a Cary, N.C.-based business intelligence and analytics software provider.
“When we compare mailing campaigns after addresses have been verified with previous mailings, we’ve seen as many as 33% of the names dropped because they were invalid,” he says. That translates into significant printing and mailing cost reduction.
Kathy Armstrong, a data quality coordinator at Republic Mortgage Insurance Co., Winston-Salem, N.C., says an automated data auditing tool, which her company purchased from Firstlogic about six months ago, has already doubled her efficiency. Plus, she’ll be able to produce more professional management reports, rather than Excel spreadsheets.
Standing apart from competitors is about presentation, consistency and conforming to standards, according to U.S. Risk Group’s Stringer.
Much of U.S. Risk’s business is written with Lloyd’s of London, he says. “Every year, when we go to renegotiate our contract with Lloyd’s, they’re looking at our results. They look to us for data. So the more we bring data that is actuarially sound and consistent with ACORD standards, the more credible that data is to them.
“If it’s not consistent and it doesn’t follow actuarial standards, they don’t pay a lot of attention to it,” he says.
ParAccel’s Coup
7/1/2009 | By Stephen Swoyer
Columnar data warehouse (DW) specialist ParAccel Inc. this week announced both a new version 2.0 release of its Analytic Database software and — of especial interest at a time when venture capital (VC) backers are becoming increasingly parsimonious (see http://www.tdwi.org/News/display.aspx?ID=9493) — a fresh infusion of investment capital.
ParAccel’s announcement comes at a particularly fraught time in the high-end DW segment. It’s in this respect that its funding win ($22 million from a number of VC backers, including new investor Menlo Ventures) is significant as it suggests that VC backers might now be moving away from the hardware-only DW appliance model first popularized by Netezza Corp. That company and relative newcomer Dataupia Corp. — along with high-end DW champion Teradata Corp. — remain the three primary purveyors of hardware-only DW appliance systems; another hardware-only player, the former DATAllegro Corp., was acquired almost one year ago by Microsoft Corp. Netezza, a publicly-traded company, enjoys profitability, as does Teradata, which was spun off from parent company NCR Corp. more than two years ago. Dataupia, on the other hand, has struggled to raise additional capital. In March, it promoted a new CEO (Cognos Inc. veteran Tony Sirianni) and, in May, laid off approximately two-thirds of its workforce.
ParAccel, by contrast, trumpets a hybrid software or hardware deployment schema. Its model is similar, in this respect, to offerings from other "third-wave" DW appliance players — i.e., vendors such as Aster Data Systems Inc., Greenplum, Infobright Corp., and Vertica Corp. — as well as seasoned veteran Kognitio (nee Whitecross), which has long sold its DW software separately.
Unlike the hardware-only systems marketed by Netezza, Teradata, and Dataupia, customers can purchase the ParAccel Analytic Database software and deploy it on their own hardware. Alternatively, by working in tandem with ParAccel or hardware partners, they can purchase the ParAccel Analytic Database as a preinstalled option.
For a long time, it seemed as if both models could coexist. Hardware-only appliance players used to tout either the logic or the desirability of the pre-fab appliance model — DW appliance visionary Foster Hinshaw, late of Dataupia, famously championed what he called a "Tivo Test" for DW appliances (see http://esj.com/articles/2008/01/30/qa-data-warehousing-and-the-appliance-model.aspx): customers wanted something that they could plug in and turn-on, not something that they had to size, install, and configure on their own, Hinshaw argued.
It increasingly looks as if the VC community — if not the market itself — has decided to back the hybrid horse, industry watchers say. "You can’t sustain a hardware-based database company," argues a prominent industry watcher with insight into Dataupia’s travails who spoke on condition of anonymity. "Teradata and Netezza have [a] longer life, but in the end they’ll need to change."
In Dataupia’s case, this insider says, there are certainly other adverse circumstances: "[Oracle] wants to sell more RAC licenses, Dataupia reduces the number of licenses [that a customer needs]." Nonetheless, this insider contends, Dataupia’s hardware-only model — chiefly, its inability "to keep up with commodity hardware" — comprises a big problem in its own right.
In this respect, it’s hard not to see ParAccel’s $22 million financing round as a validation of the hybrid model.
New Checklist Features
In addition to its successful Series C financing, ParAccel also announced version 2.0 of its flagship Analytic Database. That release is similar to several recent DBMS deliverables (from competitors Greenplum and Vertica, among others) in that it both fleshes out ParAccel’s SQL feature set — the revamped ParAccel DBMS supports SQL 2003 amenities such as window aggregates and scalar user-defined functions (UDF) — and is said to boost performance, too. Kim Stanick, who manages marketing for ParAccel, says ParAccel 2.0 also ships with a significantly improved query optimizer facility.
In this regard, she argues, ParAccel continues to shed its PostgreSQL origins.
"We’ve invested heavily in a new advanced Query Optimizer for MPP queries that is MPP aware, columnar aware, and does sophisticated and very advanced column-pruning and also does advanced query rewrites. It also has an extended query decorrelation capability for these correlated subqueries," she says.
"We’ve decided that the best thing to do is to build the best brain to run the database. It’s actually the last part of our database," Stanick continues. "People kind of accuse us of being a Postgres replica. We do have a Postgres origin, but we have two parts to our architecture. One is our compute nodes, which handle the networking; the other is the leader node that does the parsing and the planning and the optimization happens."
ParAccel’s compute nodes were "built from scratch" using a new (non-Postgres) database engine, according to Stanick; its leader nodes, on the other hand, retained a good chunk of Postgres code.
The new Query Optimizer replaces a Postgres-based optimizer that was designed chiefly for OLTP database platforms. "Quite frankly, the Postgres optimizer couldn’t plan very well beyond a couple of dozen tables," she says.
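For readers unfamiliar with the query decorrelation Stanick mentions, the Python sketch below shows the general idea using the standard library's sqlite3 module. It is a generic illustration of the rewrite, not ParAccel's optimizer; the table, data, and queries are hypothetical. A correlated subquery, which is conceptually re-evaluated for each outer row, is rewritten as a join against a result that is aggregated once.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE claims (id INTEGER PRIMARY KEY, region TEXT, amount REAL);
INSERT INTO claims (region, amount) VALUES
  ('east', 100), ('east', 300), ('west', 50), ('west', 70), ('west', 400);
""")

# Correlated form: the inner query refers to the outer row's region.
correlated = """
SELECT id FROM claims c
WHERE amount > (SELECT AVG(amount) FROM claims WHERE region = c.region)
ORDER BY id
"""

# Decorrelated form: compute each region's average once, then join.
decorrelated = """
SELECT c.id FROM claims c
JOIN (SELECT region, AVG(amount) AS avg_amount
      FROM claims GROUP BY region) r
  ON r.region = c.region
WHERE c.amount > r.avg_amount
ORDER BY c.id
"""

rows_correlated = conn.execute(correlated).fetchall()
rows_decorrelated = conn.execute(decorrelated).fetchall()
print(rows_correlated, rows_decorrelated)
```

Both queries return the same claims (those above their region's average), but the second form gives a parallel engine a single aggregation and a join to distribute rather than a per-row lookup, which is the kind of rewrite an optimizer can apply automatically.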
ParAccel and other DW specialists — including both Greenplum and Vertica — have variously worked to either refine their open source software (OSS) innards or flesh out their capabilities, particularly with respect to SQL support (see http://www.tdwi.org/News/display.aspx?id=9450).
ParAccel 2.0 boasts several other differentiators, according to Stanick. For one thing, she claims, it delivers improved support for Clariion storage from EMC Corp. "Our blended scan [capability] gives us the ability to work with EMC Clariion," she says, citing ParAccel’s partnership with EMC to deliver a pre-fab analytic appliance.
"Blended scan basically allows you to leverage both the on-server disc that you would get with a normal appliance like us or Vertica or DATAllegro or Greenplum, and you can marry that with a SAN attachment and use all of those SAN discs too. We intelligently blend the data so that the data that’s on the servers is a portion of all of the data."
The key benefit, according to Stanick, is that "you’re able to leverage both sets of I/O and both bandwidths at the same time."
Stephen Swoyer is a technology writer based in Athens, Ga.
Start-Ups Mine Database Field
Nimble Software Helps Make Sense Of Information Tide
Nov. 18, 2007
By Don Clark | Wall Street Journal
Most databases are based on technology that originated 30 years ago. But change is in the air.
A mob of start-ups have been developing variants of the software, which provides the equivalent of filing cabinets for corporate information. Customers say the offerings are generating faster answers to questions that require sifting through huge volumes of business information. Established suppliers aren’t conceding much to the newcomers, but industry executives agree the pace of progress is accelerating.
“The database market is going to be an exciting place to be in the next decade,” said Michael Stonebraker, an adjunct professor at the Massachusetts Institute of Technology and chief technology officer of a new entrant called Vertica Systems Inc.
His opinions carry some weight. Mr. Stonebraker, during a 25-year stint at the University of California, Berkeley, was a major force in the 1970s behind relational databases, the strain of technology in products from companies such as Oracle Corp., International Business Machines Corp., Microsoft Corp. and Sybase Inc. Besides his initial product, called Ingres, he helped develop another database called Postgres that many companies use today.
One reason for the latest activity is the need to make sense of a flood of business information. Web services, for example, generate a stream of information about the activities of visitors to the sites. Companies use “business-intelligence” software to analyze such data, a reason for a takeover wave that includes IBM’s deal yesterday to buy Cognos Inc. for $5 billion.
Corporate-transaction data is typically transferred to software repositories, called data warehouses, where it can be studied using business-intelligence programs. A buyer for Wal-Mart Stores Inc., for example, might want to plan for storm season by sifting through cash-register records of what people in Florida bought just before and after a major hurricane, Mr. Stonebraker said.
Depending on their complexity, such queries can take many hours using standard databases. So companies have developed a range of techniques to speed up the job.
Teradata Corp., a pioneer in data warehouses that recently was spun off from NCR Corp., developed technology to pass information quickly between server systems that come packaged with its software. Netezza Corp., a start-up in Framingham, Mass., that went public this year, helped popularize the idea of “analytic appliances,” a combination of software and servers that are accelerated with the aid of certain chips.
Other start-ups, such as Greenplum, of San Mateo, Calif., and Dataupia Corp., of Cambridge, Mass., have developed their own hardware ideas. One of their techniques is to divide up data-warehouse jobs over many inexpensive servers so that adding more computers gets answers more quickly.
One user is iCrossing Inc., of Scottsdale, Ariz., which provides analytical services to companies that operate Web sites. Analyzing a day’s worth of some types of data once took 20 to 22 hours, said Tony Wasson, the company’s vice president of engineering. With Greenplum’s technology, and some modifications to its own software, the job now takes about an hour, he said.
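The divide-the-job idea can be illustrated with a minimal sketch. It is not Greenplum's or Dataupia's implementation: the sample records, the function names, and the use of threads as stand-ins for separate servers are all assumptions made for illustration. Each "server" totals only its own slice of the data, and the partial answers are merged at the end.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor

# Hypothetical cash-register records: (state, product, units_sold).
SALES = [
    ("FL", "bottled water", 40),
    ("FL", "batteries", 25),
    ("GA", "bottled water", 10),
    ("FL", "bottled water", 15),
    ("GA", "batteries", 5),
    ("FL", "flashlights", 8),
]


def partition(rows, n_workers):
    """Deal rows out round-robin, one slice per worker."""
    return [rows[i::n_workers] for i in range(n_workers)]


def partial_totals(rows, state):
    """A worker totals units per product for its own slice only."""
    totals = Counter()
    for row_state, product, units in rows:
        if row_state == state:
            totals[product] += units
    return totals


def units_by_product(rows, state, n_workers=3):
    """Fan the query out to the workers, then merge their partial answers."""
    slices = partition(rows, n_workers)
    # Threads stand in for separate servers in this sketch (an assumption).
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        partials = list(pool.map(lambda s: partial_totals(s, state), slices))
    merged = Counter()
    for part in partials:
        merged.update(part)  # Counter.update adds counts together
    return dict(merged)


RESULT = units_by_product(SALES, "FL")
```

Because each slice is processed independently, adding workers shrinks the amount of data any one of them has to scan, which is the property the start-ups are counting on.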
Others are using a different style of software. Relational databases typically store records in rows with multiple columns of transaction information. Sifting through all those columns can create delays in getting answers.
Another approach, pioneered by Sybase, accelerates the process by searching only through specific columns that are the focus of a query. Some users of these “columnar” databases rave about them.
Investment Technology Group Inc., a New York firm that provides brokerage and technology services to institutional investors, said its data warehouse has swelled with the heavy volume of electronic trades and associated message traffic. One standard query, which analyzes transaction data over 30 days, once took about five hours, said Michael Dearinger, an ITG senior vice president. Using the columnar software Sybase IQ, the firm gets answers in about 13 minutes, he said.
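The difference between the two layouts can be sketched in a few lines. This is a toy model, not how Sybase IQ or any other product stores data: the trade records and function names are hypothetical, and counting "values read" is only a rough proxy for disk I/O.

```python
# Hypothetical trade records, used only for illustration.
ROWS = [
    {"trade_id": 1, "symbol": "IBM", "shares": 100, "price": 101.5},
    {"trade_id": 2, "symbol": "ORCL", "shares": 250, "price": 20.0},
    {"trade_id": 3, "symbol": "IBM", "shares": 50, "price": 102.0},
]


def to_columns(rows):
    """Pivot row-oriented records into one list per column."""
    return {name: [row[name] for row in rows] for name in rows[0]}


def total_shares_row_store(rows):
    """Row layout: whole records are read, so every field is touched."""
    values_read = 0
    total = 0
    for row in rows:
        for name, value in row.items():
            values_read += 1
            if name == "shares":
                total += value
    return total, values_read


def total_shares_column_store(columns):
    """Column layout: only the 'shares' column is read."""
    values = columns["shares"]
    return sum(values), len(values)


ROW_RESULT = total_shares_row_store(ROWS)
COLUMN_RESULT = total_shares_column_store(to_columns(ROWS))
```

Both functions return the same total, but the row-oriented version touches four values per record while the column-oriented version touches one. The gap widens as tables gain more columns.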
The columnar approach also is used by Vertica, the Andover, Mass., company co-founded in 2005 by Mr. Stonebraker. Its executive chairman is Jerry Held, an Oracle veteran who worked with Mr. Stonebraker at UC Berkeley. Another start-up that uses a similar technique to narrow searches is ParAccel Inc. of San Diego.
“With columnar databases you are searching only through the relevant haystack,” said Barry Zane, a former Netezza executive who is ParAccel’s chief technology officer.
Some predict specialized products will find a niche. “One kind of database is not going to suit all of the different applications we are coming up with,” said Donald Feinberg, an analyst at market researcher Gartner Inc.
Lyza Empowers New Class of BI Consumers
9/24/2008 | By Stephen Swoyer
To hear start-up Lyzasoft Inc. tell it, there’s a big bloc of potential business intelligence (BI) consumers ill-served by existing products — one-size-fits-all BI suites at one end of the spectrum and client spreadsheet tools at the other. The suites aren’t flexible or customizable enough to accommodate business analysts, marketing analysts, and other potential users. Spreadsheets, on the other hand, are seen as contributing to spreadmart hell.
Lyzasoft officials say this big bloc of consumers has been mostly muddling through on its own. "The dirty little secret of BI is that most [business intelligence] happens outside of that traditional BI stack — that is, people extracting data and going off on their own and doing whatever," says Lyzasoft founder and CEO Scott Davis.
"[T]he fact that a lot of people out at the edge of the organization or the edge of the formal BI community — the fact that they’re sort of doing stuff on their own — is an irrefutable argument that there’s something that they need to do which isn’t terribly well-suited to the traditional business intelligence process. They need autonomy, they need flexibility."
Davis and Lyzasoft co-founder Brian Krasovec (who is responsible for product development) co-founded BI consultancy Eyeris. According to Davis, Lyza is the product of an in-the-trenches development process.
It’s a familiar (if clichéd) product story: in their BI consulting work, Davis and Krasovec were surprised to discover a silent but struggling group of BI users who were either hamstrung by in-house BI tools or laboring ineffectually in spreadmart siloes.
The duo identified self-service — i.e., software flexibility and user autonomy — as the key to turning these constituents into consumers, and as a result developed Lyza, a product that balances two competing — and seemingly contradictory — requirements.
"This is a process of manipulating, visualizing, communicating, synthesizing, working with data in a way that a non-IT analyst can do on their own. It cannot use Structured Query Language. It can’t rely on enterprise hardware because [these non-IT analysts] don’t typically have access to that," he stresses.
"It has to be visual — it has to be WYSIWYG. It allows people to do things on their own in a very flexible, modular way, but it isn’t ’hack and stack on the desktop and build a spreadmart,’ so nobody knows where those numbers came from. It isn’t confined to a sandbox that you can slice and dice all that you want but you can’t move outside the sandbox."
Its impetus is familiar, but Lyza itself is decidedly unfamiliar. It’s an all-in-one BI tool, complete with integrated reporting, ad hoc query and analysis, dashboarding/presentation capabilities, and (mostly) self-service connectivity to back-end data sources, but it isn’t an end-to-end BI suite: it’s an entirely client-side solution. Everything — even ad hoc query crunching — runs on the client desktop.
More to the point, Davis stresses, it runs on either of two popular desktop environments: Windows and natively (not in Parallels, not in a virtual sandbox) on MacOS. Even though it runs on the desktop, it doesn’t limit the kinds of data sources with which consumers can work; the size of working data sets; the counts of records or columns; or the complexity of the transformations or calculations they wish to perform.
What does all of this self-serviceability and autonomy get you? Why do business analysts, marketing analysts, campaign managers, and other potential BI users need full-fledged BI stacks on their desktops?
According to Davis, a tool like Lyza lets them address the one-off, seasonal, or "incremental" projects that take too long to get approved or which otherwise never get funded.
"We’re talking $20,000 worth of incremental revenue, or $40,000 worth of incremental bottom line — but that will never fund a project. If you write a business case for that, it’ll never get funded. However, incrementally, it all adds up," he says. "What we need to do is give these guys a tool that allows them to do their own data collection, their own data synthesis, their own enrichment of that data, and to do it quickly and independently."
Lyza isn’t a spreadmart-type solution, Davis insists.
"Everything that they do to a file, everything that they do to a chart, it’s all captured in a business rules XML document. What that means is that we have metadata on every business rule manipulation from the beginning to the end. That is unique. That is not going to be replicated by anybody using spreadsheets or joint tools," he says. "There is never a point at which somebody says, ’Where did this number come from?’ That’s a big deal for analysts. It gives them a completely new notion of what something means."
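The idea of logging every manipulation to an XML document can be sketched as follows. Lyza's actual file format is not described here, so the element names, attributes, and the TrackedTable class are entirely hypothetical. The sketch shows only the general technique of recording each step alongside the data it changes.

```python
import xml.etree.ElementTree as ET


class TrackedTable:
    """Toy table that appends one XML <step> per manipulation (hypothetical format)."""

    def __init__(self, source_name, rows):
        self.rows = list(rows)
        self.log = ET.Element("businessRules")
        self._record("load", source=source_name)

    def _record(self, action, **details):
        # Each step becomes a child element whose attributes describe it.
        ET.SubElement(self.log, "step", action=action, **details)

    def filter_equals(self, column, value):
        self.rows = [r for r in self.rows if r[column] == value]
        self._record("filter", column=column, value=str(value))
        return self

    def total(self, column):
        self._record("sum", column=column)
        return sum(r[column] for r in self.rows)

    def lineage_xml(self):
        return ET.tostring(self.log, encoding="unicode")


# Hypothetical campaign data.
TABLE = TrackedTable("campaign.csv", [
    {"region": "east", "revenue": 20000},
    {"region": "west", "revenue": 15000},
    {"region": "east", "revenue": 5000},
])
TOTAL = TABLE.filter_equals("region", "east").total("revenue")
LINEAGE = TABLE.lineage_xml()
```

Anyone asking where the total came from can read the recorded steps (load, filter, sum) in order instead of reverse-engineering a spreadsheet.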
It’s precisely Lyza’s desktop-centric pitch that could endear it to business users — and pave the way for out-of-band deployments around IT, says one industry watcher.
"The thing is, it’s desktop. No admins have to worry about anything. It can be a direct end-user sales model. IT need not apply. They’ll still complain, but this is a [lot] better than screwing around with people using Excel invisibly," says Mark Madsen, a principal with BI and data warehousing (DW) consultancy Third Nature. Madsen, a veteran DW architect, says Lyzasoft is the first vendor out of the gate in what’s shaping up to be a burgeoning BI-on-the-desktop revival.
"They will have some competition this fall, but for now they’re first. Market size is hard to gauge. The thing is, the tool is all-in-one, and that means it’s a hell of a lot easier to use than Excel, or any other BI tool including Qlikview," he concludes.
"It’s aimed at analysts. Every company has several people in different departments who do analysis. They have lousy tools, and BI tools don’t do this type of thing. That means the segment appears crowded, but is actually empty."
That’s just the kind of view that should make Lyzasoft’s founders happy.
Stephen Swoyer is a technology writer based in Athens, Ga.
RED Offers Faster Development, Easier Maintenance of Data Warehouses
6/10/2009 | By Stephen Swoyer
As data warehouse (DW) specialty players go, WhereScape is by no means a newcomer. The New Zealand-based firm was founded more than a decade ago — but until recently it was best known as a provider of DW consulting and integration expertise, not as a product vendor.
Its discrete product, WhereScape RED, was one of several intriguing deliverables showcased at TDWI’s recent World Conference in Chicago. WhereScape, like competitors including Infobright, illuminate, and others, touts a requisite "distinct" take on data warehousing: it positions RED as an integrated development environment (IDE) for DW.
WhereScape pitches RED to DW developers, touting it as a means to both rapidly develop and — even more promising — to more easily maintain a DW environment. To that end, says Mark Budzinski, vice president and general manager with WhereScape USA, RED is able to consume DW source data from several environments (including DB2, Oracle, Teradata, and SQL Server); generate procedural code, scripts, and tables; build cubes; and — that bane of every developer’s existence — create requisite documentation in HTML format.
At the World Conference, WhereScape announced RED version 6, touting the addition of support for IBM Corp.’s DB2 DBMS as one of the new version’s most important features. WhereScape also trumpeted support for Teradata’s Linux scheduler along with unspecified "enhancements" for its constellation of supported DW platforms.
When it comes to platform reach, Budzinski contends, WhereScape’s claim to "support" a specific platform isn’t just a meaningless marketing buzzword. "When I say we ’support’ Oracle, what I really mean is that we actually produce the code that you would otherwise have to write if you assigned a team to develop a data warehouse or a data mart," he says. "RED is an IDE that is specifically targeted for the data mart or data warehouse developer. It’s a technology that completely performs the tasks of programming and managing a data warehouse."
WhereScape also handles the data integration heavy lifting, although Budzinski says that it approaches things from an extract, load, and transform (ELT) instead of an ETL perspective. "Although we don’t do classic ETL, we do more ELT: we’re essentially creating all of the procedural code, all of the aggregates, ultimately culminating in an actual data mart or data warehouse, up to and including a cube," he says.
"We essentially validate the data model. We create a metadata repository, which is very important."
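The ELT ordering Budzinski describes can be illustrated with a small sketch. It does not represent the code RED generates: the table names, sample rows, and the use of an in-memory SQLite database are assumptions chosen to keep the example self-contained. Raw data is loaded into a staging table first, and the transformation then runs as SQL inside the database.

```python
import sqlite3

# Hypothetical raw extract: every field arrives as text.
RAW_ORDERS = [
    ("1001", "east", "19.50"),
    ("1002", "west", "5.00"),
    ("1003", "east", "10.50"),
]


def run_elt(raw_rows):
    conn = sqlite3.connect(":memory:")
    # Extract and Load: land the data untouched in a staging table.
    conn.execute(
        "CREATE TABLE stage_orders (order_id TEXT, region TEXT, amount TEXT)"
    )
    conn.executemany("INSERT INTO stage_orders VALUES (?, ?, ?)", raw_rows)
    # Transform: cast and aggregate inside the database, not beforehand.
    conn.execute(
        """
        CREATE TABLE region_sales AS
        SELECT region,
               SUM(CAST(amount AS REAL)) AS total_sales,
               COUNT(*) AS order_count
        FROM stage_orders
        GROUP BY region
        """
    )
    rows = conn.execute(
        "SELECT region, total_sales, order_count "
        "FROM region_sales ORDER BY region"
    ).fetchall()
    conn.close()
    return rows


REGION_SALES = run_elt(RAW_ORDERS)
```

In a classic ETL flow the cast and aggregation would happen in a separate tool before loading. Here the target database does that work, which is why generated procedural code and aggregates matter so much in the ELT approach.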
RED is one of many quick-and-not-so-dirty BI or DW tools — offerings that seem to target end-user frustration with the inertia (or flat-out resistance) of internal BI and DW teams. Depending on who you talk to, user frustration is mounting, has plateaued, or (similarly) is always-already at a constant level.
The salient point, Budzinski says, is that, at any given time, a certain percentage of users (or, more important, a key percentage of executive users) will feel as if their needs aren’t being met. Such users don’t have the time or the patience to wait for IT to develop, test, and implement an enterprise data warehouse, he contends.
"I think a lot of [users] just have much more of a pragmatic perspective. A lot of developers feel this way, too. They’re not here to solve world hunger or build an intergalactic enterprise data warehouse. They’re here to quickly build out data marts and the smaller warehouses that smaller companies need, and the satellite operations that satellite offices in bigger companies need," Budzinski comments. "These developers are being told ’We have to get something done in a hurry!’ A lot of companies just can’t wait for an enterprise data warehouse, and [for these companies] that are just starting out, they’re fascinated by [RED]."
WhereScape also positions RED as a tool to help speed DW migrations. "If I have something in Oracle one day and it becomes important to relay that down to a DB2 environment, it’s going to take a couple of days to get that done and a couple of weeks to validate and make sure everything is working properly [via RED], compared to the horror story of manually moving a lot of code from Oracle to DB2," he says. "Things you can do in DB2 are different from what you can do in SQL Server. It’s not just canned code. Each environment has different stuff that’s implemented. You as the developer can change the code."
Budzinski is careful not to downplay the value and attractiveness of an enterprise data warehouse, however. The issue, he maintains, is that an EDW is something that a shop should build up to: an organization must first develop the appropriate in-house skills and (just as important) processes, in both the line-of-business and data management domains, to realize the vision of a useful and functional EDW.
Many shops simply aren’t there yet, he maintains. "There’s a maturity associated with that [EDW vision] and a lot of people haven’t even started with what they need to be doing yet," he explains.
Stephen Swoyer is a technology writer based in Athens, Ga.
Company’s first product is a database management system capable of all types of decision processing, from traditional data warehousing and analytics to operational business intelligence, online analytical processing, and high-speed query processing.
By Antone Gonsalves
November 7, 2007
Startup ParAccel has launched an analytic database that at least one expert sees as a potentially disruptive technology in the data warehouse and database management system markets.
The San Diego-based company officially launched itself and its first product last week by announcing the general availability of the ParAccel Analytic Database, a DBMS capable of all types of decision processing, from traditional data warehousing and analytics to operational business intelligence, online analytical processing, and high-speed query processing.
In addition, ParAccel announced a partnership in which Sun Microsystems would offer a DBMS appliance with ParAccel software, which is also available as a standalone database or as a drop-in database accelerator.
The software can also be configured for all-in-memory analytical processing, or for traditional disk-based database-execution deployments, James Kobielus, analyst for Current Analysis, said in a recent research note. “It can run on a single massively parallel processing-capable compute node or on multiple distributed nodes with scale-out and high availability.”
Kobielus said the potential impact of the ParAccel Analytic Database is high on the DBMS and DW markets because of its innovation, flexibility and scalability. “This new release could prove truly disruptive to established segments in which rivals offer point solutions rather than flexible, appliance-ready, analytics-processing solutions,” Kobielus said.
Nevertheless, ParAccel has its shortcomings. For one, it can operate as a drop-in accelerator only with Microsoft SQL Server, and not with the top two enterprise databases: Oracle and IBM DB2, the analyst said. In addition, ParAccel’s offering competes with products in several market niches, and the startup has yet to prove that its technology is truly the best of breed in any of those segments.
However, ParAccel’s announcement “sends a signal that innovation is alive and well in the DBMS arena,” Kobielus said.
“Rival DBMS/DW vendors should rethink their go-to-market strategies in light of the release of ParAccel Analytic Database,” the analyst said. “This radically flexible new release could prove truly disruptive to many established market segments.”
