A Case For Ergonomics February 13, 2008
Posted by Tyson McDowell in Business, Technology, web3.0. Tags: Web4.0
“Turning data into information.” It’s a marketing cliché, but that is because most people don’t actually understand what it means. Information means that somebody or something (in the case of a self-adjusting logical system) has come to a new understanding thanks to the analysis of data. But there is a rarely considered reality to the concept of information: it is not information if someone doesn’t come to understand it. It is just more data.
In my industry (healthcare revenue cycle software) there are “Business Intelligence” systems such as Medefinance, workflow systems such as Ontario Systems, and Business Rules Engines such as AHIQA. They all produce useful information, but few ensure that it is used. There are even aggregations of similar products, such as Accuro, but they don’t provide a true portal over them all so that those components are not only seamless, but naturally interoperable.
The ergonomics of information delivery is the key to making information a reality, and that is the art yet unrealized in this age of massive data environments such as healthcare, internet traffic, and everything else.
Very few technologies ensure ergonomics as much as they ensure massively parallel processing, for example. Why? Because such ergonomics are an art and a product in and of themselves (look at Apple, which differentiates on ergonomics alone). The reality is that, until you combine this art of ergonomics with those of business intelligence, rules and workflow, you don’t produce information, and you don’t get action.
This is where I say, “Benchmark’s solution is the only one in the world to combine these factors into a unified environment.” Well, at least for healthcare it is. But even at Benchmark, we are striving for a higher mark (pun intended) than even we might think possible.
The current power of technology systems simply hasn’t been fully realized yet because while they process effectively, they don’t communicate with you, the business person, effectively. Moore’s law considers computational efficiency, but what about the efficiency of thought communication? It does little good to process a billion bytes if no human or decision system will ever see or understand the result. Yet, that story plays out day after day with incredible monotony.
I challenge the tech-world to consider that we are deficient in our respect for the human user. We must remember that providing the infrastructure and ergonomic tools to easily structure information from widespread data is the only way to unlock the true power of the web. That, in my view, is the next technology revolution.
Brave New Scaling World February 10, 2008
Posted by Tyson McDowell in Technology, web3.0. Tags: excel over the web, mysql, oracle, shared-everything, shared-nothing, web2.0, web3.0
It’s finally here… the great debate between shared-nothing and shared-everything architectures for databasing. Those of you in the know are probably thinking, “What’s with this guy? That debate has been raging since the early ’90s!” Well, what’s with me is that the world finally needs shared-everything, and shared-nothing physically can’t solve the processing problems of the future.
For a glimpse into this debate, from someone who happens to share my point of view, check out Kevin Closson’s blog, which covers shared-nothing vs. shared-everything.
Now, I totally agree with Kevin and have been pursuing a shared-everything architecture for Benchmark for several years. Why not just buy Oracle RAC? Coming from life in a start-up, I am used to free! Incidentally, MySQL can be configured to be a participant in a shared-everything architecture, but it needs outside management. Besides, even RAC, while shared-everything, doesn’t consider very important aspects of load balancing that must leverage application-based context awareness. No database does, or really could. This is a matter of custom application interface, and that interface has to be bi-directional.
What I disagree with, though, is the basis on which its proponents argue that shared-everything is important. Most of what I read on why shared-everything is the way of the future revolves around the issues of availability and scaled performance over a single application, not of multi-application request load balancing.
Some background: Databases store data on disk, and keep “information” about that data in memory so that the data can be accessed faster. Since the data and the information about the data are typically kept together, they must be in the same physical database server. Even if you have a detached hard-drive system, such as a NAS or SAN, only one database instance is allowed to access one set of data on the hard drives. This results in a situation where, if you need two database servers, you must copy the data and then synchronize it as transactions happen. You end up with double the storage need, plus additional synchronization commands, plus the data itself, going between all data servers in the cluster. The result is that, due to the additional network traffic, you get incrementally less performance gain with each server added, and you didn’t actually scale your available storage even though you just upped your drive-space requirements.
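To put rough numbers on that diminishing return, here is a minimal sketch in Python. It is a simplified model of my own, not a measurement of any real database: it assumes every server has the same capacity, reads are spread evenly across servers, and every write must be applied on every copy of the data. The function names and the sample figures (1,000 operations per second, 20% writes) are made up for illustration.

```python
def replicated_throughput(servers: int, per_server_capacity: float,
                          write_fraction: float) -> float:
    """Estimate total throughput of a fully replicated (shared-nothing) cluster.

    Assumed model: reads are split across servers, but each write is
    repeated on every server, so writes never get cheaper as servers are added.
    """
    if servers < 1:
        raise ValueError("servers must be at least 1")
    if not 0.0 <= write_fraction <= 1.0:
        raise ValueError("write_fraction must be between 0 and 1")
    # Work one server performs per unit of cluster-wide throughput.
    per_server_cost = (1.0 - write_fraction) / servers + write_fraction
    return per_server_capacity / per_server_cost


def storage_required(servers: int, dataset_size: float) -> float:
    """Every server holds a full copy, so raw storage grows with the server
    count while usable storage stays at dataset_size."""
    return servers * dataset_size


if __name__ == "__main__":
    previous = 0.0
    for n in range(1, 9):
        total = replicated_throughput(n, 1000.0, 0.2)
        print(f"{n} servers: {total:7.1f} ops/s "
              f"(gain {total - previous:6.1f}), "
              f"raw storage {storage_required(n, 1.0):.0f}x")
        previous = total
```

Under these assumptions, each added server buys less than the one before it, and total throughput can never exceed the per-server capacity divided by the write fraction, no matter how many servers you add. Meanwhile raw storage keeps growing linearly.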
What is multi-application request load balancing? It’s the need of the future. It’s the probability that Web3.0 applications will serve multiple unknown needs, and therefore unstructured analytic functions will become the predominant processing demand in the server farms of our future while the same systems must still support OLAP-type requests. Why do I think this? Because it’s what I have to put up with every day at Benchmark!
Bottom line concept… “Excel Over the Web” (a term I first heard from Alistair Black, who may have coined it), over the whole Web. Ask any question, get the answer, based upon all the data out there. Semantic Web, Grid Computing, Drive Virtualization, Virtual Servers, Shared Processing Resources, Shared-Everything databasing: all of it points to this future. But every one of these areas is full of super-smart specialists who don’t necessarily understand the others. Moreover, many of these super-smart specialists don’t fully understand the business demands that these systems will need to support.
The architectural concepts wrapped up in all of the ideas mentioned above also extend into the application, and the operating infrastructure that commands all of these resources must itself be transactionally aware, data domain aware, and self-optimizing. The applications must be written to be auto-parallelizing, where possible, so that they might implement generic interfaces on the processing farm that can then expect the transaction to be “scrubbed” into a category that implies the execution method required for that transaction. These are revolutionary thoughts for a database system architect, application architect, drive-array architect, file-system architect, and more.
Distilled down, the entire system must be capable of executing any transaction, on any data, on any processing head, period. And further, the transaction must be classified so that either a temporary dedicated processing environment can be “sprited” or the cluster can assume it into the mass, optimized by the transactional load-balancing method.
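As a rough illustration of that classify-then-route idea, here is a minimal Python sketch. Everything in it is hypothetical: the `Transaction`, `Node`, and `Cluster` names, the row-count threshold, and the rule that only large ad hoc requests earn a dedicated environment are assumptions made for the example, not a description of Benchmark's system or of any database product.

```python
from dataclasses import dataclass
from enum import Enum


class ExecutionClass(Enum):
    DEDICATED = "dedicated"  # gets its own temporary processing environment
    POOLED = "pooled"        # absorbed into the shared cluster


@dataclass
class Transaction:
    name: str
    estimated_rows: int
    ad_hoc: bool = False


@dataclass
class Node:
    name: str
    active: int = 0  # transactions currently assigned to this node


# Arbitrary cut-off chosen for the example only.
HEAVY_ROW_THRESHOLD = 1_000_000


def classify(txn: Transaction) -> ExecutionClass:
    """Scrub a transaction into a category that implies how it should run."""
    if txn.ad_hoc and txn.estimated_rows >= HEAVY_ROW_THRESHOLD:
        return ExecutionClass.DEDICATED
    return ExecutionClass.POOLED


class Cluster:
    """Shared-everything assumption: any node can run any transaction
    on any data, so routing only has to look at load."""

    def __init__(self, nodes):
        self.nodes = list(nodes)
        self.dedicated_count = 0

    def dispatch(self, txn: Transaction) -> str:
        if classify(txn) is ExecutionClass.DEDICATED:
            self.dedicated_count += 1
            return f"dedicated-{self.dedicated_count}"
        # Pooled work goes to the least-loaded node.
        node = min(self.nodes, key=lambda n: n.active)
        node.active += 1
        return node.name


if __name__ == "__main__":
    cluster = Cluster([Node("node-a"), Node("node-b")])
    print(cluster.dispatch(Transaction("claim lookup", 10)))
    print(cluster.dispatch(Transaction("payment post", 25)))
    print(cluster.dispatch(Transaction("what-if analysis", 5_000_000, ad_hoc=True)))
```

The point of the sketch is that the router never asks where the data lives. That question only disappears when every processing head can reach every byte, which is the shared-everything premise.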
On that basis, business application developers can then write context-aware optimizations that can solve the performance and scaling problems associated with the Excel Over The Web (or at least Excel over massive amounts of data) concept. In the end, the Database domain of control will be reduced. Caching, Indexing, Clustering, stored-bytes access, and Optimization methods must be dynamically configurable by the requesting application.
The Database has always been a black box, but that is slowly changing. For the future of the web, that box must be totally influenceable, because only the application will know what it needs of it. For that to be possible, a shared-everything architecture is mandatory.
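To make "configurable by the requesting application" a little more concrete, here is a small Python sketch of a data store whose caching behavior is decided per request by the caller rather than by the store. The `AccessHints` and `ConfigurableStore` names are invented for the example, and caching stands in for the whole list (indexing, clustering, and so on). No real database exposes this exact interface.

```python
from collections import OrderedDict
from dataclasses import dataclass


@dataclass(frozen=True)
class AccessHints:
    """Per-request instructions supplied by the application (hypothetical)."""
    use_cache: bool = True
    cache_size: int = 128


class ConfigurableStore:
    """Toy store: the caller, not the store, decides whether a read is cached."""

    def __init__(self, rows: dict):
        self._rows = rows
        self._cache = OrderedDict()
        self.backend_reads = 0  # counts reads that went past the cache

    def get(self, key, hints: AccessHints = AccessHints()):
        if hints.use_cache and key in self._cache:
            self._cache.move_to_end(key)  # mark as most recently used
            return self._cache[key]
        self.backend_reads += 1
        value = self._rows[key]
        if hints.use_cache:
            self._cache[key] = value
            # Evict least recently used entries beyond the requested size.
            while len(self._cache) > hints.cache_size:
                self._cache.popitem(last=False)
        return value


if __name__ == "__main__":
    store = ConfigurableStore({"claim-1": "paid", "claim-2": "denied"})
    store.get("claim-1")
    store.get("claim-1")  # served from cache
    one_off = AccessHints(use_cache=False)
    store.get("claim-2", one_off)  # a one-off scan should not pollute the cache
    print(store.backend_reads)
```

An application that knows a request is a one-off analytic scan can tell the store not to bother caching it, while a workflow screen that hits the same records repeatedly can ask for the opposite.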
The Data Currency February 6, 2008
Posted by Tyson McDowell in Business, Pontification, web3.0. Tags: Analysis, Data Warehouse, Google, web2.0, web3.0
They say, “He who has the gold makes the rules.” Ever more so, data is the gold of our world today.
There is wild power that can be derived from data. Just look at Google’s discovery that probability-based deduction is more efficient at language translation than knowledge-based lexical and syntactic analysis. That means that Google has so much data, and so much computing power to crunch it, it is faster to simply compare translated texts to each other than it is to bother figuring out what the text means or how it is structured!
If data is gold, then those who have gold are banks. Banks only take a portion of the gold in exchange for holding and transacting the gold. A fine business, but the lion’s share of the market value is still in the gold, not the bank. The gold only realizes its value when someone spends it to create something new and useful.
Having tons of data at your disposal is certainly a path to money, and I know that is obvious to most of you. But what is not always understood is that the real value lies in the rationalization of that data more than in the data itself. In the Google example, it isn’t having all the books lying around in different languages that created the value; it is the resulting language algorithm that means something.
When you analyze data, you spend it to create new information and a method to act on it. At Benchmark, we spend data to drive work more efficiently to hospital administrators. This allows hospitals to get more of the money that they are owed, thanks to information gleaned from data taken from all over the revenue cycle. At Bank of America, the data is spent to understand what kind of risk they can take on a person applying for a loan (at least, that was how they did it before sub-prime).
So many of the Web2.0 and Web3.0 companies are about becoming banks of data… or databanks… old term, new times. Yes, I know, you get advertising revenues from having lots of people, and advertisers can mine their use statistics to arrive at a CPM value. But that is merely a smidgeon of that data’s earning potential! Mass databanks coupled with ergonomic knowledge systems will resolve serious world issues and drive the final nail into the coffin for information inefficiency and guess-work, regardless of the scale of question to be answered.
Of course, such a utopia will never be fully realized, but we are on the verge of a prototype. Mass storage with Internet accessibility, the concept of (but as yet poor implementation of) semantic web, Google’s indexing of pages, and the ever-more-open architecture of the social-network infrastructure all lay the groundwork for a knowledge revolution.
It is exciting, but few seem to fully grasp the implications. Well, I know one thing for sure: healthcare will be one of the first to demonstrate this great power, so long as I continue to have something to do with it.