Data Allocation Strategies

Data allocations are how one logical group of data gets spread across a destination data set, e.g. a group of applications that uses multiple servers (Apptio, 2015). According to ETL-Tools (n.d.), the allocation you choose determines the level of granularity you can get. Choosing one can be a judgment call, and understanding your allocation strategy is vital for developing and understanding your data models (Apptio, 2015; ETL-Tools, n.d.).

The robustness and accuracy of the model depend on the allocation strategy between data sets, especially because the wrong allocation can create data fallout (Apptio, 2015). Data fallout occurs when data fails to get assigned between data sets, much like how a SQL join (left join, right join, etc.) can fail to match every line of data between two data sets.

ETL-Tools (n.d.) describes dynamic and fixed levels of granularity, whereas Apptio (2015) describes many different levels of granularity. The following are some of the different data allocation strategies (Apptio, 2015; Dhamdhere, 2014; ETL-Tools, n.d.):

  1. Even spread allocation: data allocation where every data point is assigned the same share no matter what (i.e. every budget in the household gets the total sum of dollars divided by the number of budgets, regardless of the fact that the mortgage costs more than the utilities). It is the easiest to implement, but it is overly simplified.
  2. Fixed allocation: data allocation based on data that doesn't change and stays constant over time (i.e. credit card limits). It is easy to implement, but the logic can be risky for data sets that can change over time.
  3. Assumption-based allocation (or manually assigned percentages or weights): data allocation based on arbitrary means or an educated approximation (i.e. budgets, but not a breakdown). It draws on subject matter experts, but it is only as good as the level of expertise making the estimates.
  4. Relationship-based allocation: data allocation based on the association between items (i.e. hurricane maximum wind speeds and hurricane minimum central pressure). It is easily understood; however, some nuance can be lost. In the given example there can be a lag between maximum wind speeds and minimum central pressure, meaning the two are highly correlated but the allocation still has errors.
  5. Dynamic allocation: data allocation based on data that can change via a calculated field (i.e. tornado wind speed mapped to an Enhanced Fujita scale rating). It is easily understood; unfortunately, it is still an approximation, albeit at a higher fidelity than the lower-level allocations.
  6. Attribute-based allocation: data allocation weighted by a static attribute of an item (i.e. corporate cell phone costs by data usage per service provider such as AT&T, Verizon, or T-Mobile; direct spend weighting of shared expenses). It reflects real-life usage, but lacks granularity when you want to drill down to find a root cause.
  7. Consumption-based allocation: data allocation by measured consumption (i.e. checkbook line items, general ledgers, activity-based costing). It requires huge data sets that must be updated frequently, but it provides greater fidelity.
  8. Multi-dimensional allocation: data allocation based on multiple factors. It can be the most accurate allocation for complex systems, but it can be hard to understand intuitively and is therefore not as transparent as a consumption-based allocation.

The higher the number, the more mature the allocation and the higher the level of granularity of the data. Sometimes it is best to start at level 1 maturity and work your way up to level 8. Dhamdhere (2014) suggests that consumption-based allocation (i.e. activity-based costing) is a best practice when it comes to allocation strategies, given its focus on accuracy. However, some levels of maturity may not be acceptable in certain cases (ETL-Tools, n.d.). Please consider what allocation strategy is best for you, for the task before you, and for the expectations of the stakeholders.
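To make the contrast concrete, below is a minimal Python sketch, using made-up numbers, that compares a level 1 even spread with a level 6 attribute-based (weighted) allocation of a single shared cost. Swapping a measured consumption figure in for the usage attribute would turn the same weighting logic into a level 7 consumption-based allocation.

# A minimal sketch (hypothetical numbers) contrasting two allocation strategies.
shared_cost = 12000.0  # e.g., one month of a shared corporate cell phone bill

# Static attribute used for weighting: data usage in GB per department (made up)
usage_gb = {"Sales": 120, "Engineering": 300, "HR": 30, "Finance": 50}

# 1. Even spread allocation: every group gets the same share, no matter what.
even_share = shared_cost / len(usage_gb)
even_allocation = {dept: even_share for dept in usage_gb}

# 6. Attribute-based allocation: weight each group's share by its data usage.
total_usage = sum(usage_gb.values())
attribute_allocation = {dept: shared_cost * gb / total_usage
                        for dept, gb in usage_gb.items()}

for dept in usage_gb:
    print(f"{dept:12s} even: ${even_allocation[dept]:9.2f}   "
          f"weighted: ${attribute_allocation[dept]:9.2f}")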

Resources:

Foul on the X-axis and more

There are multiple ways to use data to justify any story or agenda one has. My p-hacking post shows how statistics have been abused to get statistically significant results so that work gets published; and since journals and editors do not glorify replication studies, it can be hard to fund the studies that would catch it. However, there are also ways to manipulate graphs to fit any narrative you want. Take the figure below, which was published on the Georgia Department of Public Health website on May 10, 2020. Notice something funny going on in the x-axis: it looks like a Doctor Who voyage across time trying to solve the coronavirus crisis. The dates on the x-axis are not in chronological order (Bump, 2020; Fowler, 2020; Mariano & Trubey, 2020; McFall-Johnsen, 2020; Wallace, 2020). The dates are in the order they need to be in to make it appear that the number of coronavirus cases in Georgia's top five impacted counties is decreasing over time.

Figure 1: May 10 top five impacted counties bar chart from the Georgia Department of Public Health website.

If the dates in the figure above were lined up chronologically, it would tell a different story. Once this chart was made public, it garnered tons of media coverage and was later fixed. But this happens all the time when people have an agenda: they mess with the axes to get the result they want. It is rare, though, to see a real-life example of it on the x-axis.

But wait, there's more! Notice the grouping order of the top five impacted counties. Pick a color: it looks like the Covid-19 counts per county are playing musical chairs. What was done here was that, within each day, the five counties were re-sorted in descending count order, which makes the chart even harder to understand and interpret, again sowing a narrative that may not be accurate (Bump, 2020; Fowler, 2020; Mariano & Trubey, 2020; McFall-Johnsen, 2020; Wallace, 2020).
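As a rough illustration of the fix, here is a small Python/pandas sketch, with entirely made-up numbers rather than the Georgia data, that keeps the x-axis in chronological order and gives each county a fixed column (and therefore a fixed color) in the grouped bars:

import pandas as pd

# Hypothetical counts, not the actual Georgia data.
df = pd.DataFrame({
    "date":   ["2020-05-09", "2020-05-02", "2020-05-06"] * 2,
    "county": ["Fulton"] * 3 + ["Cobb"] * 3,
    "cases":  [95, 130, 110, 60, 85, 70],
})
df["date"] = pd.to_datetime(df["date"])

# Chronological x-axis, one fixed column per county: rows = dates, columns = counties.
chart_data = (df.pivot(index="date", columns="county", values="cases")
                .sort_index())
print(chart_data)  # feed this to any grouped-bar chart; the dates stay in order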

Now, according to Fowler (2020), there are also issues in how the number of Covid-19 cases gets counted here, which adds to misinformation and sows further distrust. It is just another way to build the narrative you wish you had: by carving out an explicit definition of what is counted and what is not, you can cause an artificial skew in your data, again favoring a narrative or producing false results that could be accidentally generalized. Here Fowler explains:

“When a new positive case is reported, Georgia assigns that date retroactively to the first sign of symptoms a patient had – or when the test was performed, or when the results were completed. “

Understanding that the virus had many asymptomatic carriers that never got reported is also part of the issue. Understanding that you could be asymptomatic for days and still have Covid-19 in your system means that the definition above is completely inaccurate. Also, Fowler explains that there is such a backlog of Covid-19 tests that it can take days to report a positive case; so a rolling report of the last 14 days, combined with that definition, will see the numbers shift wildly with each iteration of the graph. Thus, even once Figure 1 was fixed, the last 14 days will inherently show a decrease in cases due to the backlog, the definition, and our evolving understanding of the virus; see Figure 2.

Figure 2: May 19 top five impacted counties bar chart from the Georgia Department of Public Health website.

They did fix the ordering of the counties and the x-axis, but only after it was reported by Fox News, the Washington Post, and Business Insider, to name a few. However, the definition of what counts as a Covid-19 case still distorts the numbers and still tells the wrong story. It is easy to see this effect when you compare the May 4-9 data between Figure 1 and Figure 2: Figure 2 records a higher incidence of Covid-19 over that same period. That is why definitions and criteria matter just as much as how graphs can be manipulated.

Mariano & Trubey (2020) do have a point: some errors are expected during a time of chaos, but basic good stewardship of the data should still be observed. Be careful about how data is collected and how it is represented on graphs, and look not only at the commonly manipulated y-axis but also at the x-axis. That is why the methodology sections in peer-reviewed work are so important.

Resources:

LSAT Conditionals and CS Conditionals

In the past few months, I have been studying for the LSAT exam. Yes, I am contemplating law school. Law school will be a topic for another day. However, I came across a few points that are extremely interesting and could spark discussion in the computer science field. In computer science, our coding languages have conditional statements, the most common of which is the IF-THEN statement. However, the LSAT has made me realize that there is more to IF-THEN conditional statements than I thought, and here is why (Teti et al., 2013):
  1. If X then Y (a simple IF-THEN statement)
  2. If not Y then not X (the contrapositive of 1)
  3. X if and only if Y means if X then Y, and if Y then X (the conditional runs in both directions)
  4. X unless Y means if not X then Y
where X is the sufficient variable and Y is the necessary variable. The word "if" can be replaced by "all," "any," "every," and "when" (Teti et al., 2013), whereas "then" can be replaced by "only" or "only if." Remember that a conditional phrase like the ones above introduces a relationship between the variables, but it doesn't establish anything concrete: a sufficient variable (X) is enough to guarantee Y, but Y on its own is not enough to guarantee X.
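As a quick sanity check of translations 1-4 above, here is a short Python sketch of my own (not from Teti et al.) that brute-forces the truth tables:

from itertools import product

def implies(a, b):
    # "if a then b" is false only when a is true and b is false
    return (not a) or b

for x, y in product([True, False], repeat=2):
    # 1 and 2: a conditional and its contrapositive always agree
    assert implies(x, y) == implies(not y, not x)
    # 3: "X if and only if Y" is both directions at once
    assert (x == y) == (implies(x, y) and implies(y, x))
    # 4: "X unless Y" read as "if not Y then X" agrees with "if not X then Y"
    assert implies(not y, x) == implies(not x, y)

print("All four conditional translations hold for every truth assignment.")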
Subsequently, with any conditional, we also have to look at conjunctive "and" and disjunctive "or" statements (the sketch after this list checks these translations):
  1. Both X and Y = X + Y
  2. Either X or Y = X or Y
  3. Not both X and Y = not X or not Y
  4. Neither X nor Y = not X + not Y
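Here is that quick check: brute-forcing the four truth assignments shows the last two translations are just De Morgan's laws (again my own sketch, not from Teti et al.):

from itertools import product

for x, y in product([True, False], repeat=2):
    # "Not both X and Y" is the same as "not X or not Y"
    assert (not (x and y)) == ((not x) or (not y))
    # "Neither X nor Y" is the same as "not X and not Y"
    assert (not (x or y)) == ((not x) and (not y))

print("The conjunctive/disjunctive translations hold for every truth assignment.")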

We should note that an “or” statement can also allow for the possibility of both (Teti et al., 2013). Additionally, the LSAT adds some nuance to the conditional phrase by adding an “EXCEPT” clause.  For instance (Teti et al. 2013):

  1. Must be true EXCEPT = Could be false
  2. Could be true EXCEPT = Must be false
  3. Could be false EXCEPT = Must be true
  4. Must be false EXCEPT = Could be true
The LSAT treats these conjunctive, disjunctive, and conditional phrases with a bit more nuance than computer scientists typically do, and maybe we can bring some of that nuance into future coding to get more nuanced code and results.
Some people may say that this whole post is overkill and ask why we have to look into such nuance. Each of the bullets above is necessary and has value; it exists in the lexicon for a particular reason. We can easily decompose each of these and map them out in simpler terms in a programming language. However, trying to capture every nuanced characteristic of these conditional phrases can also produce really nasty, convoluted code.
Resources:
  • Teti, T., Teti, J., and Riley, M. (2013). The Blueprint for LSAT Logic Games. Blueprint LSAT Preparation.

To Do List: Home Network for Working Remotely

During 2020, with the rise of coronavirus disease 2019 (Covid-19), we have seen a rise in working remotely. However, we soon realized that not everyone has the same connectivity speeds in their homes. We have also realized that there are internet connectivity deserts in the United States, where students from K-12 through university may not have reliable access to the internet. Though this post is not going to address internet access and connectivity deserts in the U.S., it will offer tips and techniques to help improve connectivity speeds in your home network. Even where internet connectivity can be taken for granted, connection speeds can vary from home to home, which can impact performance when working remotely. A quick test: if you can stream Disney+, Netflix, Hulu, YouTube, or any other video streaming platform on your devices over your wireless network, you should be able to work from home using WebEx, Zoom, etc. Essentially, 20 megabits per second or greater will suffice; however, 20-30 megabits per second is pushing it, especially if you do not live alone or have many other devices.

Your wireless connection is based on bandwidth, and depending on the type of plan with your service provider, that plan usually has a fixed amount, i.e. 20 megabits per second, 50 megabits per second, 1 gigabit per second, etc. Therefore, it is imperative to use every bit effectively. How many devices do you have connected to your network? I have at least 8 devices on my network, with usually 5 connected at any one moment. Even when your cell phone is not streaming anything, it is still interacting with your network, consuming a small bit of your network's bandwidth.

However, even if I have 5 items connected to the network, I do not have them all streaming something at the same time. If we crudely extrapolate from my anecdotal case, a family of four could have about 20 devices connected to one network. What do we do? Consider budgeting and prioritizing streaming times for a more cost-effective solution (see the rough budget sketch below). Depending on where you live, you may also be able to contact your service provider to increase your bandwidth. Also, check whether your house or neighborhood has been wired for fiber or just DSL (a fiber-optic connection is best as of the writing of this post) and switch if you are not on fiber; a fiber connection allows for higher connectivity speeds.
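For a rough sense of how quickly a plan gets used up, here is a back-of-the-envelope Python sketch; the per-activity rates are assumptions for illustration, not measurements:

plan_mbps = 50.0  # hypothetical plan speed, in megabits per second

# Rough concurrent-use estimates in megabits per second (assumptions, not measurements)
devices = {
    "work video call (HD)": 3.5,
    "4K streaming TV":      25.0,
    "HD streaming tablet":  5.0,
    "music streaming":      0.5,
    "phones (background)":  1.0,
}

used = sum(devices.values())
print(f"Estimated concurrent use: {used:.1f} of {plan_mbps:.1f} Mbps "
      f"({used / plan_mbps:.0%} of the plan)")
if used > 0.8 * plan_mbps:
    print("Little headroom left; consider staggering streaming or upgrading the plan.")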

Age plays a role in internet connectivity speeds, even if you have paid for higher speeds from your provider. The older the house, the older the wiring connecting you to the internet provider, and over time internet cable connections can degrade, which also impacts performance. The age of the router matters too: an older router may not be capable of receiving the higher speeds from your service provider. If you are renting one, contact your service provider to upgrade the router for free. If you have purchased one (recommended and more cost-effective), it may be time to upgrade it if it is old.

Even with a newer router, does it have the latest security patch? Malware can infect your router and degrade your connectivity speeds. Regardless of the device, it is always wise to accept new upgrades and security patches. There is a caveat here: given the haste with which some patches come out, I personally wait some time before installing one, depending on the need (the more severe the current vulnerability, the sooner I accept the patch). Security patches installed in more haste than warranted can sometimes cause more issues than they solve.

When you have set up wireless in your house, you may have dead spots or streaming bottlenecks. It is best to test your connection speed using an app or a connection speed test website like www.speedtest.net. Start testing in the same room as, and nearest to, the router. There are two goals to testing near the router: (1) to see if your speed is within 5-10 megabits per second of what you are paying your internet provider for, and (2) to set a baseline for what you should be getting around the house. If you are not getting the speed in your contract, check the age of and security patches on your router and retest. If that doesn't help, contact your service provider to address the situation. Once you are within the 5-10 megabit per second variance from your contract, go around and test each room. You will see that the further you are from the router, the more the connection speed may drop, and that the connection in different rooms may vary significantly. Different devices may also have variable connection speeds depending on their age; a 6-year-old laptop may have slower connection speeds than the latest mobile device.
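If you prefer to script the check, the sketch below uses the third-party speedtest-cli Python package (pip install speedtest-cli); the package choice and the contracted speed are assumptions on my part, and a browser visit to www.speedtest.net works just as well:

import speedtest  # third-party package: pip install speedtest-cli

CONTRACT_MBPS = 50.0  # hypothetical: the speed you are paying for

st = speedtest.Speedtest()
st.get_best_server()
down_mbps = st.download() / 1e6  # the library reports bits per second
up_mbps = st.upload() / 1e6

print(f"Download: {down_mbps:.1f} Mbps, Upload: {up_mbps:.1f} Mbps")
if down_mbps < CONTRACT_MBPS - 10:
    print("More than ~10 Mbps below the contracted speed: check the router's age "
          "and patches, retest next to the router, then contact the provider.")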

The best solution when working from home is to install your router and wireless access point in the same room as your workstation and, if at all possible, to use a wired connection from your laptop directly to the router, bypassing wireless altogether for your work device. If the speed you measure while wired to the router is significantly faster than the speed you measure over wireless, you know there may be an issue with the wireless. You can either buy a new wireless access point if yours is old or troubleshoot from that point forward.

In summary, once you have addressed the above, consider the following to preserve your bandwidth when working from home: (1) limit unnecessary video chatting via online meeting platforms; (2) limit streaming video on your work device; (3) limit streaming music on your work device; and (4) limit the number of devices connected to and streaming data on your home network. This is because live video takes more data to stream than prerecorded video (from YouTube, Netflix, etc.), which takes more than live voice over the network (Voice over IP), which takes more than streaming music (Pandora, Google Music, etc.).

Parallel Programming: Threads

A thread is a unit (or sequence of code) that can be executed by a scheduler, essentially a task (Sanden, 2011). A single thread (task) has one program counter and one sequence of code. Multithreading occurs when multiple threads, each with its own program counter, share common code; the sequences of code can be assigned to different processors to run in parallel (simultaneously) and speed up a task. Another form of multithreading has threads execute the same code on different processors with different inputs. If data is shared between threads, there is a need for a "safe" object through synchronization, where only one thread at a time can access the data stored in the "safe" object. It is through these "safe" objects that one thread communicates with another.
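A minimal Python sketch of this idea (my own illustration, not Sanden's code): several threads share one "safe" object, a counter guarded by a lock, so only one thread can touch the shared data at a time.

import threading

class SafeCounter:
    """A "safe" object: a lock serializes access to the shared value."""
    def __init__(self):
        self._value = 0
        self._lock = threading.Lock()

    def increment(self):
        with self._lock:      # synchronization: one thread at a time
            self._value += 1

    def value(self):
        with self._lock:
            return self._value

counter = SafeCounter()

def work():
    for _ in range(10_000):
        counter.increment()

threads = [threading.Thread(target=work) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(counter.value())  # 40000: the lock keeps the shared state consistent

Without the lock, the four threads could interleave their read-modify-write steps and silently lose updates.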

An additional example that may help illustrate the material: 

Suppose we would like to know the average of all the credits and the average of all the debits made in personal checking accounts at SunTrust Bank in December. Using map-reduce techniques with multiple threads, we can go through the entire database system to find the accounts and time-stamped transactions, map out all the data, and reduce it to the two numbers our query needs to return.
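A toy Python sketch of that map-reduce idea, run over a handful of made-up transactions with a thread pool rather than a real banking database:

from concurrent.futures import ThreadPoolExecutor

# Hypothetical December transactions; in reality these would be partitions of a database.
transactions = [
    ("credit", 1200.00), ("debit", 45.10), ("credit", 300.25),
    ("debit", 80.00), ("credit", 50.00), ("debit", 12.99),
]
partitions = [transactions[i::3] for i in range(3)]  # pretend these live on 3 nodes

def partial_sums(part):
    """Map step: sum and count the credits and debits within one partition."""
    sums = {"credit": [0.0, 0], "debit": [0.0, 0]}
    for kind, amount in part:
        sums[kind][0] += amount
        sums[kind][1] += 1
    return sums

with ThreadPoolExecutor(max_workers=3) as pool:
    partials = list(pool.map(partial_sums, partitions))

# Reduce step: combine the partial results into the two averages the query asked for.
for kind in ("credit", "debit"):
    total = sum(p[kind][0] for p in partials)
    count = sum(p[kind][1] for p in partials)
    print(f"Average {kind}: {total / count:.2f}")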

Resources:

Adv DB: Web DBMS Tools

Developers need tools to design web-DBMS interfaces for dynamic use of their sites, whether for e-commerce (an Amazon storefront), decision making (National Oceanic and Atmospheric Administration weather forecast products), gathering information (SurveyMonkey), etc.  ADO.NET and Fusion Middleware are two of the many tools and middleware products that can be used to develop web-to-database interaction (MUSE, 2015).

ADO.NET (Connolly & Begg, 2014)

Microsoft's approach to web-centric middleware for the web-database interface, which provides compatibility with the .NET class library, support for XML (used extensively as an industry standard), and connected/disconnected data access.  It has two tiers: the dataset (data table collection, XML) and the .NET Framework Data Provider (connection, command, data reader, and data adapter for the database).

Pros: Built on standards, which allows non-Microsoft products to use it.  It automatically creates XML interfaces so the application can be turned into a web service.  Even the .NET classes conform to XML and other standards.  Other development tools for further expanding the GUI set can be added and bound to the web service.

Cons: According to the Data Developer Center website (2010), with connected data access you must explicitly manage all database resources, and not doing so can cause resource mismanagement (connections are never freed up).  Some functions in certain classes are also missing, such as mapping to table-valued functions in the Entity Framework.
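ADO.NET itself is a .NET API, but the resource-management pitfall translates to any connected data access. Below is a Python/sqlite3 analogue, an assumption for illustration only (the database path and orders table are hypothetical), showing a leaked connection versus one released deterministically:

import sqlite3
from contextlib import closing

def total_orders_leaky(db_path):
    conn = sqlite3.connect(db_path)
    return conn.execute("SELECT COUNT(*) FROM orders").fetchone()[0]
    # Bug: the connection is never closed, so repeated calls leak resources.

def total_orders_managed(db_path):
    # The connection is closed deterministically, even if the query raises.
    with closing(sqlite3.connect(db_path)) as conn:
        return conn.execute("SELECT COUNT(*) FROM orders").fetchone()[0]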

Fusion Middleware (Connolly & Begg, 2014):

Oracle's approach to web-centric middleware for the web-database interface, which provides development tools, business intelligence, content management, etc.  It has three tiers: Web (using Oracle Web Cache and HTTP Server), Middle Tier (apps, security services, WebLogic servers, other remote servers, etc.), and Data (the database).

Pros: Scalable. It is based on a Java platform (a full Java EE 6 implementation).  It allows Apache modules, such as those that route HTTP requests, run stored procedures on a database server, provide transparent single sign-on, S-HTTP, etc. Its business intelligence function allows you to extract and analyze data to create reports and charts (statically or dynamically) for decision analysis.

Cons: The complexity of the system, along with its new approach, creates a steep learning curve and requires skilled developers.

The best approach for me was Microsoft's: if you want to connect to many other Microsoft applications, this is one route to consider.  It has a gentle learning curve (from personal experience).  Another aspect: when I was building apps for the library at the University of Oklahoma, the DBAs and I didn't really like the basic GridView functionality, so we exploited the aforementioned pro of interfacing with third-party code to create a more interactive table view of our data.  What is also nice is that our data was on an Oracle database, and all we had to do was switch the pointer from SQL Server to Oracle, without needing to change the GUI code.

Resources

Plagiarism: A word

The following article, found on https://www.econtentpro.com/blog/, talks about abuses that can lead to various forms of plagiarism.  eContent Pro (2019) is a really great article showcasing that there is more than one way to plagiarize.  However, they did not provide examples to showcase each case, nor did they explain the nuance in case 2 all that well (eContent Pro, 2019):

  1. Self-plagiarism
  2. Overreliance on Multiple Sources
  3. Patchwriting
  4. Overusing the same source

The following is my attempt to do just that.

Example of Self-plagiarism

If I were to use the following two paragraphs verbatim in a new paper or as a book chapter, even though these are my words from Hernandez (2017a), it would be considered self-plagiarism.  It is good to recycle your works cited page, but it is not good to recycle your words the way you would recycle plastic bottles.

Chapter 1: An Introduction to Data Analytics

Data analytics has existed before 1854. Snow (1854) had a theory on how cholera outbreaks occur, and he was able to use that theory to remove the pump handle off of a water pump, where that water pump had been contaminated in the summer of 1854. He had set out to prove that his hypothesis on how cholera epidemics originated from was correct, so he then drew his famous spot maps for the Board of Guardians of St. James’ parish in December 1854. These maps were showed in his eventual 2nd edition of his book “On the Mode of Communication of Cholera” (Brody, Rip, Vinten-Johansen, Paneth, & Rachman, 2000; Snow, 1855). As Brody et al. (2000) stated, this case was one of the first famous examples of the theory being proven by data, but the earlier usage of spot maps has existed.

However, the use of just geospatial data analytics can be quite limiting in finding a conclusive result if there is no underlying theory as to why the data is being recorded (Brody et al., 2000). Through the addition of subject matter knowledge and subject matter relationships before data analytics, context can be added to the data for which it can help yield better results (Garcia, Ferraz, & Vivacqua, 2009). In the case of Snow’s analysis, it could have been argued by anyone that the atmosphere in that region of London was causing the outbreak. However, Snow’s original hypothesis was about the transmission of cholera through water distribution systems, the data then helped support his hypothesis (Brody et al., 2000; Snow 1854). Thus, the suboptimal results generated from the outdated Edisonian-esque, which is a test-and-fail methodology, can prove to be very costly regarding Research and Development, compared to the results and insights gained from text mining and manipulation techniques (Chonde & Kumara, 2014).

Example of Overreliance on Multiple Sources

The following was taken from my dissertation (Hernandez, 2017b).  There is definitely a heavy reliance on sources here, as with any dissertation, master's thesis, or interdisciplinary work. However, my voice still shines through, and that is where eContent Pro (2019) draws the line: is the author's voice still present?

This excerpt shows how I gathered multiple methodologies from multiple sources and combined them to form a best practice for data preprocessing. Another word for this process is synthesis. No single source had all the components, and listing which source contained which parts of the best-practice methodology was the purpose of these three paragraphs.  If my voice weren't present in these paragraphs, it would be considered plagiarism.

Collecting the raw and unaltered real world data is the first step of any data or text mining research study (Coralles et al., 2015; Gera & Goel, 2015; He et al., 2013; Hoonlor, 2011; Nassirtoussi et al., 2014). Next, preprocessing raw text data is needed, because raw text data files are unsuitable for predictive data analytics software tools like WEKA (Hoonlor, 2011; Miranda, n.d.). Barak and Modarres (2015), Miranda (n.d.), and Nassirtoussi et al. (2014) concluded that in both data and text mining, data preprocessing has the most significant impact on the research results.

Raw data can have formats that change across time, therefore converting the data into one common format for analysis is necessary for data analytics (Mandrai & Barkar, 2014). Also, the removal of HTML tags from web-based data sources allows for the removal of extraneous data points that can provide unpredictable results (Netzer et al., 2012). Finally, deciding on a strategy about how to deal with missing or defective data fields can aid in mitigating noise from the results (Barak & Modarres, 2015; Fayyad et al., 1996; Mandrai & Barskar, 2014; Netzer, 2012). Furthermore, to gain the most insights surrounding a research problem, data from multiple data sources should be collected and integrated (Corrales et al., 2015).

Predictive data analytics tools can analyze unstructured text data after the preprocessing step. Preprocessing involves tokenization, stop word removal, and word-normalization (Hoonlor, 2011; Miranda, n.d.; Nassirtoussi et al., 2014; Nassirtoussi et al., 2015; Pletscher-Frankild et al., 2015; Thanh & Meesad, 2014). Tokenization is when a body of text is reduced to a set of units, phrases, or groups of keywords for analysis (Hoonlor, 2011; Miranda, n.d.; Nassirtoussi et al., 2014; Nassirtoussi et al., 2015; Pletscher-Frankild et al., 2015; Thanh & Meesad, 2014). For example, the term eyewall replacement would be considered one token, rather than two words or two different tokens. Stopword removal is the removal of the words that add no value to the predictive analytics algorithm from the body of text; these words are prepositions, articles, and conjunctions (Hoonlor, 2011; Miranda, n.d.; Nassirtoussi et al., 2014; Nassirtoussi et al., 2015; Thanh & Meesad, 2014). Miranda (n.d.) stated that sometimes stop-word removals could also be context-dependent because some contextual words can yield little to no value in the analysis. For instance, meteorological forecast models in this study are considered context-dependent stopwords. Lastly, word-normalization transforms the letters into a body of text to one single case type and removes the conjugations of words (Hoonlor, 2011; Miranda, n.d.; Nassirtoussi et al., 2014; Nassirtoussi et al., 2015; Thanh & Meesad, 2014). For example, stemming the following words cooler, coolest, and colder becomes cool-, which heightens the fidelity of the results due to the reduction of dimensionalities.

Example of Patchwriting and Overusing the Same Source

This is a self-created meta-example for this post, which happens to be a curation post for Service Operations KPIs and CSFs. The words below have been lifted from various sections of:

Each sample Critical Success Factors (CSFs) is followed by a small number of typical Key Performance Indicators (KPIs) that support the CSF. These KPIs should not be adopted without careful consideration. Each organization should develop KPIs that are appropriate for its level of maturity, its CSFs and its particular circumstances. Achievement against KPIs should be monitored and used to identify opportunities for improvement, which should be logged in the CSI register for evaluation and possible implementation.

Service Operations: Ensures that services operate within agreed parameters; when a service is interrupted, Service Operations restores it as soon as possible.

Request Fulfillment Management: Request Fulfillment is responsible for

  • Managing the initial contact between users and the Service Desk.
  • Managing the lifecycle of service requests from initial request through delivery of the expected results.
  • Managing the channels by which users can request and receive services via service requests.
  • Managing the process by which approvals and entitlements are defined and managed for identified service requests (future).
  • Managing the supply chain for service requests and assisting service providers in ensuring that the end-to-end delivery is managed according to plan.
  • Working with the Service Catalog and Service Portfolio managers to ensure that all standard service requests are appropriately defined and managed in the service catalog (future).

 

  • CSF Requests must be fulfilled in an efficient and timely manner that is aligned to agreed service level targets for each type of request

o    KPI The mean elapsed time for handling each type of service request

o    KPI The number and percentage of service requests completed within agreed target times

o    KPI Breakdown of service requests at each stage (e.g. logged, work in progress, closed etc.)

o    KPI Percentage of service requests closed by the service desk without reference to other levels of support (often referred to as ‘first point of contact’)

o    KPI Number and percentage of service requests resolved remotely or through automation, without the need for a visit

o    KPI Total numbers of requests (as a control measure)

o    KPI The average cost per type of service request

  • CSF Only authorized requests should be fulfilled

o    KPI Percentage of service requests fulfilled that were appropriately authorized

o    KPI Number of incidents related to security threats from request fulfilment activities

  • CSF User satisfaction must be maintained

o    KPI Level of user satisfaction with the handling of service requests (as measured in some form of satisfaction survey)

o    KPI Total number of incidents related to request fulfilment activities

o    KPI The size of current backlog of outstanding service requests.

Incident Management: Incident Management is responsible for the resolution of any incident, reported by a tool or user, which is not part of normal operations and causes or may cause a disruption to or decrease in the quality of a service.

  • CSF Resolve incidents as quickly as possible minimizing impacts to the business

o    KPI Mean elapsed time to achieve incident resolution or circumvention, broken down by impact code

o    KPI Breakdown of incidents at each stage (e.g. logged, work in progress, closed etc.)

o    KPI Percentage of incidents closed by the service desk without reference to other levels of support (often referred to as ‘first point of contact’)

o    KPI Number and percentage of incidents resolved remotely, without the need for a visit

o    KPI Number of incidents resolved without impact to the business (e.g. incident was raised by event management and resolved before it could impact the business)

  • CSF Maintain quality of IT services

o    KPI Total numbers of incidents (as a control measure)

o    KPI Size of current incident backlog for each IT service

o    KPI Number and percentage of major incidents for each IT service

  • CSF Maintain user satisfaction with IT services

o    KPI Average user/customer survey score (total and by question category)

o    KPI Percentage of satisfaction surveys answered versus total number of satisfaction surveys sent

  • CSF Increase visibility and communication of incidents to business and IT support staff

o    KPI Average number of service desk calls or other contacts from business users for incidents already reported

o    KPI Number of business user complaints or issues about the content and quality of incident communications

  • CSF Align incident management activities and priorities with those of the business

o    KPI Percentage of incidents handled within agreed response time (incident response-time targets may be specified in SLAs, for example, by impact and urgency codes)

o    KPI Average cost per incident

  • CSF Ensure that standardized methods and procedures are used for efficient and prompt response, analysis, documentation, ongoing management and reporting of incidents to maintain business confidence in IT capabilities

o    KPI Number and percentage of incidents incorrectly assigned

o    KPI Number and percentage of incidents incorrectly categorized

o    KPI Number and percentage of incidents processed per service desk agent

o    KPI Number and percentage of incidents related to changes and releases.

Problem Management: Problem Management is responsible for the activities required to

  • Diagnose the root cause of incidents.
  • Determine the resolution to related problems.
  • Perform trend analysis to identify and resolve problems before they impact the live environment.
  • Ensure that resolutions are implemented through the appropriate control procedures, especially change management and release management.

Problem Management maintains information about problems and appropriate workarounds and resolutions to help the organization reduce the number and impact of incidents over time. To do this, Problem Management has a strong interface with Knowledge Management and uses tools such as the Known Error Database.

  • CSF Minimize the impact to the business of incidents that cannot be prevented

o    KPI The number of known errors added to the KEDB

o    KPI The percentage accuracy of the KEDB (from audits of the database)

o    KPI Percentage of incidents closed by the service desk without reference to other levels of support (often referred to as ‘first point of contact’)

o    KPI Average incident resolution time for those incidents linked to problem records

  • CSF Maintain quality of IT services through elimination of recurring incidents

o    KPI Total numbers of problems (as a control measure)

o    KPI Size of current problem backlog for each IT service

o    KPI Number of repeat incidents for each IT service

  • CSF Provide overall quality and professionalism of problem handling activities to maintain business confidence in IT capabilities

o    KPI The number of major problems (opened and closed and backlog)

o    KPI The percentage of major problem reviews successfully performed

o    KPI The percentage of major problem reviews completed successfully and on time

o    KPI Number and percentage of problems incorrectly assigned

o    KPI Number and percentage of problems incorrectly categorized

o    KPI The backlog of outstanding problems and the trend (static, reducing or increasing?)

o    KPI Number and percentage of problems that exceeded their target resolution times

o    KPI Percentage of problems resolved within SLA targets (and the percentage that are not!)

o    KPI Average cost per problem.

Event Management: These processes have planning, design, and operations activity. Event Management is responsible for any aspect of Service Management that needs to be monitored or controlled and where the monitoring and controls can be automated. This includes:

  • Configuration items.
  • Environmental controls.
  • Software licensing.
  • Security.
  • Normal operational activities.

Event Management includes defining and maintaining Event Management solutions and managing events.

  • CSF Detecting all changes of state that have significance for the management of CIs and IT services

o    KPI Number and ratio of events compared with the number of incidents

o    KPI Number and percentage of each type of event per platform or application versus total number of platforms and applications underpinning live IT services (looking to identify IT services that may be at risk for lack of capability to detect their events)

  • CSF Ensuring all events are communicated to the appropriate functions that need to be informed or take further control actions

o    KPI Number and percentage of events that required human intervention and whether this was performed

o    KPI Number of incidents that occurred and percentage of these that were triggered without a corresponding event

  • CSF Providing the trigger, or entry point, for the execution of many service operation processes and operations management activities

o    KPI Number and percentage of events that required human intervention and whether this was performed

  • CSF Provide the means to compare actual operating performance and behaviour against design standards and SLAs

o    KPI Number and percentage of incidents that were resolved without impact to the business (indicates the overall effectiveness of the event management process and underpinning solutions)

o    KPI Number and percentage of events that resulted in incidents or changes

o    KPI Number and percentage of events caused by existing problems or known errors (this may result in a change to the priority of work on that problem or known error)

o    KPI Number and percentage of events indicating performance issues (for example, growth in the number of times an application exceeded its transaction thresholds over the past six months)

o    KPI Number and percentage of events indicating potential availability issues (e.g. failovers to alternative devices, or excessive workload swapping)

  • CSF Providing a basis for service assurance, reporting and service improvement

o    KPI Number and percentage of repeated or duplicated events (this will help in the tuning of the correlation engine to eliminate unnecessary event generation and can also be used to assist in the design of better event generation functionality in new services)

o    KPI Number of events/alerts generated without actual degradation of service/functionality (false positives – indication of the accuracy of the instrumentation parameters, important for CSI).

Access Management: Access Management aims to grant authorized users the right to use a service, while preventing access by non-authorized users. The Access Management processes essentially execute policies defined in Information Security Management. Access Management is sometimes also referred to as "Rights Management" or "Identity Management".

  • CSF Ensuring that the confidentiality, integrity and availability of services are protected in accordance with the information security policy

o    KPI Percentage of incidents that involved inappropriate security access or attempts at access to services

o    KPI Number of audit findings that discovered incorrect access settings for users that have changed roles or left the company

o    KPI Number of incidents requiring a reset of access rights

o    KPI Number of incidents caused by incorrect access settings

  • CSF Provide appropriate access to services on a timely basis that meets business needs

o    KPI Percentage of requests for access (service request, RFC etc.) that were provided within established SLAs and OLAs

  • CSF Provide timely communications about improper access or abuse of services on a timely basis

o    KPI Average duration of access-related incidents (from time of discovery to escalation).

Resources:

Unconventional Hurricane Prep

We are at the height of hurricane season again in the United States. Although we have hurricane preparation lists from multiple websites, I would like to share some unconventional items:

To do:

  1. As bad as a Hurricane is, it is a fantastic way to really meet your neighbors.
  2. Fill up your bathtubs with water. If there is no running water afterward, you can use a bucket of water from the tub to refill the back of your toilet so you can flush.
  3. Fill up a cup of water and put it in the freezer to freeze. Then put a coin on the top of the frozen water. If you come back from evacuating and don’t know if you lost power for a while check the cup. If the coin is frozen to the bottom of the cup you know the food defrosted and refroze when the power came back on. Throw out the food. If the coin is still on top your food is fine.
  4. Also, fill your Tupperware with water and freeze it. It can be used after the storm in coolers to keep food and drinks cold. Ice will be precious!
  5. Fill up coolers with water for drinking after the storm because the water after the storm is contaminated and may have to be boiled before use.
  6. Make sure nothing is left outside that can hit the house. Pull all outside furniture, bird feeders, etc. into the garage; most things left outside become flying objects. If you can't bring trash cans or recycling bins in, empty them before the storm hits and let them fill with water to weigh them down.
  7. Don’t buy hurricane snacks too early because you will eat them before it hits.
  8. Fill up all cars with gasoline or diesel prior to evacuating or staying put.  You never know when the next shipment of gas will come in nor how much more expensive it will be, because of low supply and high demand. Save your gas after the storm. If you must sight-see, use a bicycle.
  9. Close all doors in the home. Especially if you evacuate. If you lose a window, that should confine water damage.
  10. Metal garage doors, especially the 9- to 10-foot variety, can't handle strong winds; they will collapse inward. Either park in front of them or back a car up against them from the inside. Don't forget a heavy blanket between the door and the car.
  11. Always assume a downed power line may be present in standing water.
  12. If you lose power and if you have any solar-powered lights, bring some inside to light the house at night and back in the sun during the day. It saves batteries!
  13. If you’re evacuating, don’t forget to unplug any electrical items that you can, eg TV, router, desktop pc, etc. If your area loses power, there could be a surge when your power company is trying to bring folks back online. Most surge protectors will help, but it’s usually better to be safe than sorry. Also, when cutting areas back online there may be power “blips” (on then off really quick). It’s best to wait until the power stabilizes to plug stuff back in.
  14. Charge all your electronics and turn them off before the storm.  This includes computers, cellphones, etc.
  15. Make sure all your dishes and laundry are clean. It might be a week before the power comes back on. In Miami, for Hurricane Wilma, I was without power for almost a month.
  16. If you have no landline phone, arrange to work with someone who does. It can take weeks, if not a month, after a storm before all cell towers are realigned.

Do this every year prior to the storm season

  1. Check your trees for dead limbs. Service them if you can.
  2. If you have a Generator or Chain saw, service it now. Make sure it is ready to go.
  3. Check your grille. You’ll need gas or lots of wood/ charcoal for cooking. Never use a gas grill indoors.
  4.  Stock your and your family’s medical needs as well.

If you have pets

  1. Fill up a clean, large plastic bin with water for your dog or other pets.
  2. Of course, all pets should be brought in.
  3. With all the noise from wind and rain, a crate-trained pet will stay calmer if she can go into her crate.
  4. If you’re evacuating and you have a To-Go bag, make one for your pet – medicines, collar and leash with id, crate, towel(s), blanket or bed, favorite toy, treats, food, and water; also, any pertinent medical records (shots, medical history). Take your pet with you.
  5. Should the worst occur and your pet gets out and is lost, having her microchipped will help ensure she gets home. A VIP Petcare clinic will chip them for about $19.  After a storm, many pet rescue groups will come in to pick up "strays." If your pet has no microchip, they could pick her up, and your pet could end up being adopted out to someone hundreds or even thousands of miles away.

Again, this is unconventional advice, to be followed along with the conventional advice. However, it is just as valuable.

Database Management: SQL Joins

Please note that the following blog post provides a summary view of what you need to get done, with quick SQL examples that illustrate how to do it. For more information, please see the resources below:

Equijoins
SELECT e.ename, e.deptno,  d.deptno, d.name
  FROM emp e INNER JOIN dept d
  ON e.deptno = d.deptno
Non-Equijoins
SELECT e.ename, e.sal, s.grade
  FROM emp e INNER JOIN salgrade s
  ON e.sal BETWEEN s.losal AND s.hisal

From:
grade      losal        hisal
-----      -----        ------
1            700        1200
2           1201        1400
3           1401        2000
4           2001        3000
5           3001        9999

Gives the following solution:
ename           sal     grade
----------   --------- ---------
JAMES            950         1
SMITH            800         1
ADAMS           1100         1
Outer joins
SELECT e.ename, e.deptno,  d.deptno
  FROM emp e RIGHT JOIN dept d
  ON e.deptno = d.deptno

SELECT e.deptno,  d.deptno, d.name
  FROM emp e LEFT JOIN dept d
  ON e.deptno = d.deptno
Self Joins
SELECT worker.ename + ' works for ' + manager.ename
  FROM emp worker INNER JOIN emp manager
  ON worker.mgr = manager.empno

 

A Letter of Gratitude to Dr. Shaila Miranda

Dr. Shaila Miranda has taught me that I am the author of my story. I have known Dr. Shaila Miranda for many years. During this period, she has excelled as a mentor and an educator. Throughout my two years at the University of Oklahoma, Dr. Miranda took the initiative to know her students on a personal level. I first met Dr. Miranda at a riveting presentation she gave during the M.B.A. Program Prelude Week. After further interactions, she inspired me to seek a dual master's degree, an M.B.A. and an M.S. in M.I.S., rather than the traditional M.B.A. degree. Dr. Miranda helped me realize my hidden passion for information systems [technology]. It takes an exceptional mentor to recognize and instill a vision so powerful that it can alter the course of a mentee's path.

A few semesters after our original meeting, she learned about a non-profit I was about to start and saw how I leveraged social media to forward the cause. This inspired her to become a Sooner Ally, and other M.I.S. faculty followed suit. This demonstrates her passion and conviction as an outstanding educator. Dr. Miranda is willing to listen, learn, and act based on her interactions with students just as much as she is willing to support them. She demonstrated social awareness, becoming a model professor for the other professors in the department while modeling inclusive behavior for her students.

As one of her students, I was completely engaged in the course she was teaching. Her curriculum was remarkable; her lectures and active learning with real-world data gave the class invaluable insight. Dr. Miranda's passion for and commitment to education could be seen throughout the semester as she sought out employees from Devon and other local companies to help facilitate our education. This was her demonstrating relationship management, which has allowed her to educate her students at a deeper level.

Her commitment to her students did not end with the term. This was evident when she nominated me to represent the University of Oklahoma at the Information Systems and Walmart IT Summit. She coached the students individually, and as a group, to give us a competitive edge in the competition. As if that were not enough, her commitment to her students is so vast that she drove the team to Arkansas and attended our presentations with a video recorder in hand. It was her lessons that took our team to 3rd place in the Walmart IT Summit competition. She made us aware of ourselves and our surroundings; that is what gave us the competitive edge.

As graduation neared, she arranged mock interviews that helped me land two job offers. Upon receiving both offers, she assisted me in the decision-making process, engaging the self-awareness and self-management sides of my emotional intelligence to help me make the right decision. It was that decision that got me the job I have now, which has allowed me to attend Colorado Technical University to finally complete my doctorate. Thus, words cannot express how grateful I am for this outstanding educator. She got to know me as an individual, mentored me, and made me who I am today.

She believed in me when others didn't, and for that I am grateful. She developed me into the person I am today, and she even provided a key piece of advice for my dissertation (the tool I eventually used to analyze my data), even though she wasn't in the same school or on my committee. She was still managing her relationship with me years after I completed my education in that department. She showed me that mentorship and relationships extend beyond the boundaries of an organization and traverse time. This is what I can learn from her: believe in the people you lead.