My Offline Web App – application caching and manifest files in HTML5. IT WORKS TOO!

My Need: I tried to make my own iOS app that would take html, scrape the links, get and store the data, change the links and cache a new html file.

OK – as usual – someone already solved that problem (it always happens).

Here are some AWESOME links that helped me build my solution. I am SURE you will find them beyond helpful, like I did.

Why this app: I want something so I do not have to count on my wifi or 3G when reading parent/kid teaching material, because we might be camping or, most of the time, lying in bed JUST out of reach of the wifi. A 3 year old doesn’t have the patience for you to get in range, start up facebook … again, and type your group name in, only to find that the links on the wall are outdated. I have 3 kids in 3 different age groups. I need something that is JUST GOING TO WORK when I need it to work. And it does. Nice.

Design thoughts, steps and resources that you can use too: I want an app that works offline. Well, html5 has a solution already AND IT WORKS, and there are some really smart guys with a lot of good material. Let me bookmark the ones that got me going:

  1. installed a wordpress site
  2. got a blank or simple html5 template that I could hack (building into it is what I actually did). I read this book a while back.
  3. in my plugin I added some code to call my new template, and I request a manifest file with the &manifest=1 GET parameter
    1. this calls the single.php page which, when it sees &manifest=1,
    2. calls a new header called header-manifest.php, and I
    3. make the template also redirect to single-manifest.php, which together
    4. spit out the manifest headers with [html]Content-type: text/cache-manifest[/html]
    5. Read more about what to spit out with these links
      1. Offline Web Applications – Dive Into HTML5
      2. How to create offline webapps on the iPhone | The CSS Ninja – All things CSS, JavaScript & HTML
      3. This is the start of my app, which uses what these guys do with JavaScript so you can SEE and understand what is going on. THESE GUYS ROCK. The client-side JavaScript needs to be working to monitor progress for your users. Read the comments on that page and you will see that you only need to do an ajax query to the locally cached version of your manifest file (that is cool too), so you don’t have to fetch 3 times from the server
        1. fetch 1 – the page
        2. fetch 2 – the same page but ransacked for all the pdf links and other resources to cache
        3. fetch 3 – not required any longer due to buddy’s comment in the above link – but a pull of the manifest file to do some regex’s and get the # of manifest items so you can see your progress.
      4. To include jQuery – use this link from google
      5. Some stuff from Apple Safari Web Content Guide: Storing Data on the Client
        1. Apple / iOS has some issues with using a UIWebView and getting the app cache to cache all files. I think I need to increase the cache size. For now, I have it opening up Safari and it works great. Here is what might work: Link1, Link2, Link3. These guys seem to have got it to work, but I need to get my app out and working.
    6. It works and works great. Click on this link and see the top data (my stuff) loading and the app cache stuff loading live. At the time of writing this, the bottom of the page is quite texty. I need to jQuery it to get it snazzy and graphical.
    7. Turn off the wireless (enter the airplane mode) and keep browsing the links. AMAZING!
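
The scrape-the-links-into-a-manifest step can be sketched roughly like this. My real version lives in the WordPress templates (php), so this Python is only an illustration under assumptions: the names `LinkCollector`, `build_manifest` and `count_cache_entries` are made up for the example, and the `NETWORK: *` section is one common choice, not necessarily what my templates emit. The count function shows how a progress total can be worked out from the manifest text.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkCollector(HTMLParser):
    """Collect href/src attribute values from a page (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        for name, value in attrs:
            if name in ("href", "src") and value:
                self.links.append(value)


def build_manifest(html, base_url, version="v1"):
    """Return cache-manifest text listing every resource linked from html."""
    collector = LinkCollector()
    collector.feed(html)
    seen = []
    for link in collector.links:
        absolute = urljoin(base_url, link)
        if absolute not in seen:  # keep page order, drop duplicates
            seen.append(absolute)
    # Changing the version comment is what makes browsers re-fetch the cache.
    lines = ["CACHE MANIFEST", "# " + version, "CACHE:"]
    lines.extend(seen)
    lines.extend(["NETWORK:", "*"])  # assumption: everything else needs the network
    return "\n".join(lines) + "\n"


def count_cache_entries(manifest_text):
    """Count explicit cache entries, e.g. to show 'file 3 of 12' progress."""
    count = 0
    section = "CACHE:"  # entries before any section header are explicit entries
    for raw in manifest_text.splitlines()[1:]:  # skip the CACHE MANIFEST line
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        if line in ("CACHE:", "NETWORK:", "FALLBACK:"):
            section = line
            continue
        if section == "CACHE:":
            count += 1
    return count
```

Whatever produces the text, it has to be served with the Content-type: text/cache-manifest header mentioned in step 3.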

      This is the file that gets scraped for the links into a manifest file which is a list of the resources that get cached - offline. Pics too.

The geekery that makes it work - the text version. jQuery to come

Reachability and SimplePing much improved and useable on iOS

I will do my best to give everyone credit where credit is due – I have looked at so many websites. 

Is your app waiting for the network or an online resource, and are you ensuring your users know that there is something going on? If you don’t take care of this and give your users an idea of what the phone is doing (or whether it is doing anything), the Apple app store apparently could reject your app. Some authors out there have been rejected, and then you are put at the end of the line to resubmit.

So, the app I am making loads webcams for an in-house preschool security system and is ONLY supposed to work inside the wifi network. Here is a GREAT resource to start with: you can see if www.apple.com is available. BUT … I wanted to know if the IP address 192.168.1.12 (a local LAN IP address) was available, because then I know that it is the correct location. The Apple Reachability example AND a MUCH better one from www.ddg.com have a method called reachabilityWithAddress: but it doesn’t work for this: it says everything is available with ANY IP address! (As far as I can tell, it only reports that a packet could leave the phone, not that the host answers.) Ideally I want to know if the network IP address of the iPhone is on the same network (see WhatsMyIP), but that is not in my camera app yet.

I upgraded www.ddg.com’s code and converted it to ARC compatible code (refactored it) and modernized it a bit for iOS 6.1.

But the biggest thing is that I added and wrapped Apple’s SimplePing as well, since that gets me over the hump. I have a question in to the fellow that made Reachability better, to see if he has insight into the native reachabilityWithAddress: .

It also has a method to set how many times it pings (since SimplePing never stops) and how long to wait before retrying those subsequent pings. There is a button next to the ping that allows you to re-ping. It works when the wifi changes, when cellular changes, when VPN changes: all sorts of new information, available in a utility that helps you code your own app instead of guessing what might be working or not.

You can edit the IP address live and it resets and retries everything again X times, so in fact it is a useful tool for when you go ‘on the road’ to see what flags and conditions your new app sees. In my case, I need to know if the wifi is live and if I can ping a certain IP address. Then I can let my users know, with an activity indicator, what is going on and why the cameras do not load, because most of the time they should not load. When you’re at the customer’s location, the cameras should load. Outside, or using cellular 3G/4G data, it should tell them that things will not work and how to obtain the guest ID of the customer’s wifi network.
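
The retry idea (a limited number of tries, a timeout on each, and never on the main thread) can be sketched outside of iOS too. This is not the Objective-C code from the project, just a minimal Python illustration under assumptions: it swaps the ICMP ping for a plain TCP connect (raw pings usually need special privileges), and `probe`, `probe_in_background` and the default numbers are all made-up names and values.

```python
import socket
import threading
import time


def probe(host, port, attempts=3, timeout=1.0, pause=0.5):
    """Return True if a TCP connection to host:port succeeds within `attempts` tries."""
    for attempt in range(attempts):
        try:
            # Each try gives up after `timeout` seconds instead of hanging.
            with socket.create_connection((host, port), timeout=timeout):
                return True
        except OSError:
            if attempt < attempts - 1:
                time.sleep(pause)  # wait before the next try
    return False


def probe_in_background(host, port, on_result, **kwargs):
    """Run probe() on a worker thread so the caller (e.g. a UI) never blocks."""
    worker = threading.Thread(
        target=lambda: on_result(probe(host, port, **kwargs)), daemon=True
    )
    worker.start()
    return worker
```

For the camera case this might be called as `probe_in_background("192.168.1.12", 80, show_status)`, where port 80 and `show_status` are assumptions about how the camera and the app are set up.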

Interesting points and links I used along the way that are not already listed:

  • SimplePing locks the phone for up to a minute if you continuously ping without waiting for a response. Let it time out before you retry AND run that code in a different thread. New concepts exist that we can take advantage of, like closures in Groovy. Here are “Grand Central Dispatch” tutorials #1 and #2 that I used
  • GitHub and Xcode – I have little hair to spend on this. Spend the $$ and download Tower for Mac. Ahhh, that’s better.
  • Apple’s sockets and timeouts
  • another good article about reachability that helped me understand

Here is the code on git. Send me a line if you end up re-using it – it is nice to know who and where on the Globe one’s stuff gets to!

“So what language do you use?” – a pet peeve question

The title is meant to catch your attention, and even I sometimes ask that question. It bothers me because people are so much more than just programmers: they are solutions providers and go waaaaaay beyond technical know-how. Soft skills, like those learned in a leadership masters degree, are primarily people related and affect the bottom line even more than the technical details. Similarly, when people ask “what language do you program in” . . . the answer is waaaaay beyond a specific one: it is the solution provided that matters. Therefore the correct answer is “whatever language gets the job done and whatever it takes to glue things together … for you!”

Below is my personal experience with each language, along with how familiar I am with it.

My mainstays seem to be php, .net (vb and c#) and Filemaker but lately I have done so much groovy and Java development that that is now second nature!

PHP – I can’t count how many thousands of lines I have coded. Most of my web development uses this. It is a simple language to understand, and hundreds of people have been deeply embedded in it over the past 10+ years.

.net (vb and c#) was my mainstay while contracting to Telvent (now Schneider Electric) in their screen development. Dozens of time-saving stand-alone tools were developed alongside the main job: designing .net based screens to control pipelines and high power electrical systems (those were my tasks). Other .net jobs include the parent paging tool developed to integrate with a check-in system, which has been in use since 2004. Lately a tool to interface with a counter-top inventory tracking system was developed in .net; it runs nightly to scrape data and process material consumption via a .net API, and another tool gets customer data from another database.

Groovy – it took me a little time to get my head around groovy, and a friend asked “so how do you like crawling again?” Within a week of forcing myself to learn it well, I was up and running. You can do so much in a few lines of groovy; it is, I think, an advanced language. Java (which groovy runs on top of) was so easy to learn, as it is just like C. My advice is to learn its terminology through this great tutorial available at Oracle.

Java. Within a week I was as fluent with this as with Cocoa or .net. It is a very structured language with little room for making mistakes. A lot of custom code has been written for the Bonita platform, and Java ties directly into it since Bonita is written in Java and uses groovy scripting to complete the rest.

Bonita. This can be considered a programming language of sorts. Most of the flow is done visually. Most of the nitty gritty is done in groovy and/or Java.

Filemaker. I have written extensive administration systems in Filemaker: hundreds of scripts integrated with dos and unix scripts for doing things like ftp, xml manipulation etc. This is beyond thinking in terms of tables. You use the scripts to drive navigation, decisions and to automate many tasks in Filemaker. It is a database system that anyone can understand.

C and C++. I learned these when I was 14: grabbed a book and started to go. In 2004, long after they were developed, I had to use object-oriented concepts. The whole idea of asynchronous clicking by the user forces good programmers to think proactively about ‘what could go wrong’.

Cocoa – iOS and Mac OS X development. Steve Jobs was a genius (Cocoa grew out of NeXTSTEP, which his company NeXT built). The language behind it, Objective-C, is basically C with objects added, and the framework is structured so that you can design fast. It is tricky but I am now in the cool club. A dashboard app, a crossword puzzle, a security camera viewer app, a classroom coordinator app (a multimedia player over multiple machines) and a parent page viewer have all been developed with Cocoa.

Applescript – the most English-like version of programming I have ever done. I have automated many processes to allow files to be transferred, things to be uploaded etc. Teenagers were able to do a whole secretary’s job by moving this and that process ahead with automation. Many applications can be scripted with applescript, so it fills the gap. Put this together with Automator and you can do just about anything.

VBA – One might think this is .net; it isn’t really. It counts since I have done THOUSANDS of apps and lines of code with this. Programming in Excel was the mainstay, but work in Powerpoint, Access etc. has been done as well.

Excel. I would argue that there are many times when Excel is eerily like a programming language: a cascading series of custom formulas is just like programming. Add a hint of VBA and MANY full-blown applications result.

Javascript and Actionscript. Everything on the web in the old days required checking, and Javascript with HTML5 now also thrives on it. I have used this extensively as well. Now, with jQuery as a hybrid version of web programming, Javascript is still an active language. In the world of flash, actionscript replaces javascript, and many months of actionscript are also under my belt.

Perl – This was the first language with which I replaced an entire administration position, with staff working at home over the internet. One can do almost anything with Perl, and do it in such a way that no one can understand what you have done (including yourself if you go back to it in 6 months). This is not what I have done, but it shows what a powerful language it is. It is a language that glues things together and can do complex mathematical operations.

Labview. Extensive automation was done with GPIB interfaced lab equipment. This is a language in which you draw wires from one part to the other. Many lines were drawn to accomplish what most people do by typing.

Mathcad and Matlab. One is graphical, the other more like programming. These were the mainstay of my early career, while taking my Engineering Masters at the University of Calgary and my undergraduate degree at the University of Waterloo.

Python was used for scripting on the XBMC video player, as it has a Python interpreter in it. The XBMC platform runs on all sorts of platforms, even jailbroken Apple devices, so it was a perfect choice for the project. It used Python . . . so Python was learned. It has a syntactically different structure with no { }; instead the amount of indentation determines what code belongs with what. Neat.

Unix shell scripting like awk and sed. Are those languages? Sure. I have fought tenaciously with XML and XSL and understand those now as well. Add Google map points with kml, facebook apps (Javascript and php), DOS, SQL, wsh scripting and other proprietary languages and you will soon see why I cannot and do not like to answer the question “so what language do you use?”

What the blog is about

This is a blog of sorts to discuss what types of interesting things one can do with technology. It is a resume of sorts, but it is also meant to inspire others that really interesting things can be done with the right tools. For each major type of platform, there will be a separate blog created. The intent is to explore and expose some items that can help the readers of this blog. You are not alone – and it can be done.

The major platforms that I will blog about are

  • pure web tools and integration (php, javascript, mySQL),
  • Bonita BPM development (Bonita, Java, groovy, mySQL),
  • Filemaker integration (oh the things that can be done),
  • ZOHO development (online office suite),
  • .net development,
  • macros and scripting (Excel, Applescript, mouse control, scheduling etc.) and
  • other platforms like integration into Facebook etc.

I will also have a shameless blog about the projects that we have worked on: a resume of sorts, but geared more to showing depth and intensity for people who are interested in various aspects of my resume. If you want to see my resume, see it on LinkedIn.

Debugging Bonita – Tips and Tricks

To see the Bonita Engine Log for debugging issues:

  • Use Cygwin (a version of a unix shell in windows) http://cygwin.com/install.html
  • use the tail program with this line of code
    [groovy]
    tail /cygdrive/c/BOS-5.9/studio/workspace/.metadata/engine.log --lines=300 --follow
    [/groovy]

Insert an HTML widget for debugging

  • In the pool header – I have created a global map called “configMap” and there is an item ‘debug’
  • insert HTML widget into any page and have it shown conditionally upon configMap[‘debug’] having a true value

Writing to the Bonita Log

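[groovy]
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

// Get a logger named after the current script or class
Logger LOGGER = LoggerFactory.getLogger(this.getClass());
// Log at ERROR level so the line stands out in the log
LOGGER.error("Your String Here");
[/groovy]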

Debugging on the Server

Here is what I do know: there is hope, but things are NOT the same as on a development computer.

  • The debug output from System.out.println from Java DOES go into 1 of 3 logs (will edit this when I get paths)
  • The output from the logging routine above goes to another log (will edit this when I get paths)
  • The output from the server hosting things like Tomcat goes to a 3rd log (will edit this when I get paths)

Think customer focused when designing – work yourself out of a job

If you think bottom line for your customer, it will be noticed. At Telvent, I worked not only on projects but also on how to make design itself better and faster, so that Telvent could hire lower paid designers to use the higher paid engine design that I developed. This creates a demand for what you are doing: your designs start designing themselves, and you are so far ahead of the curve that you will always be in demand. It is a win-win. Here are some other tips that keep you looking like a professional and never like a buffoon. One caution, however, is to market what you are doing. If you don’t, the extra time you spend upfront making the customer better and faster might make you look like you are wasting time. Show the customer how it is a win-win, or a triple win, which it usually is: the last win is the customer being able to design the next iteration with cheaper staff or in half the time.

At first this seems to take more time, but as you get successive wins under your belt (and hence the customer’s too) it starts to get noticed that you can do just about anything.

Here are some tips for designing in a win-win way

  • think modular design – how can what I do be a class or a reusable part? How can one piece of lego have just one more hook that I can use in the future? Watch out that the complexity stays easy to configure though
    • think toolbelt. The more tools you have, the faster and more stable the design is
  • document document document your code. Blog, so to speak, what you do day to day and explain to yourself what you are trying to accomplish. In the short term it helps you explain it to your customer or to others who interface with what you are trying to accomplish; in the long run it helps your customer
    • for .net for example, learn the sandcastle documentation rules. It takes an afternoon and about 4 days of using it and you have all the rules in your head. Keep it up or you will start to lose it
    • learn the rules for php doc, Java doc or what ever else you are using to document your code so that a machine can cobble it together
    • document on the side with Word (words and final pics) and Powerpoint (generating pics) since they are so prolific and seemingly universal. You can use proprietary software, but Word and Powerpoint are everywhere your customers are and hyperlink quite well
    • use wikis for communication between colleagues – they can edit the doc to be even better than you alone can accomplish: the power of collaboration
  • Use a code repository. I have used SVN, GitHub and Mercurial. Using a backup program does not cut it. I have lost two of my own personal prize apps for iOS; luckily these were personal projects
    • SVN is the best for versioning binary files since you can lock others out while you work on a file
    • GitHub and Mercurial are great for multisite collaboration or a regimented distribution system to the end customer. Slow and steady wins the race.
    • be verbose in your comments
    • never procrastinate about getting code into a repository, and schedule time for it at the end of each day. Once a VM went on the fritz and many fellow coders had to redo their code. Not so with my code; thank goodness I followed the golden rule
  • Never hard code – no matter what the pressure. If you do – use the well known phrase “TODO” in your code and schedule a time in the future to work this code from hard coding to dynamic design. This makes you look like your intentions were pure which they are. Document when you hard code and mention ideas to work this out to a config file or config-array in the future
  • keep todo lists
  • keep people informed of what works and what is left to go in black and white. This is where wiki’s or published excel sheets are so handy to clarify what is done and what is to go
    • write down – I repeat – write down where struggles are occurring so others can help with experience or with pulling strings to get items out of your way
    • never ever assume the other person has heard you – it takes 7 times on average before someone ‘gets what you are really saying’
    • keep track of your hours and publish/communicate them frequently
  • put larger goals aside to get quick wins for you and your customer/boss. If it is important to them, it should be important to you. Agree to disagree sometimes and get items done that satisfy others if they deem it important
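
To make the “never hard code” tip concrete, here is a minimal sketch of moving values out of the code and into a config file with sane defaults. It is only an illustration: the setting names, the JSON format and the `load_config` helper are assumptions, not something from a particular project.

```python
import json
import os

# Hypothetical defaults; a real project would list its own settings here.
DEFAULTS = {"base_url": "http://localhost", "retries": 3, "debug": False}


def load_config(path):
    """Merge settings from a JSON file over DEFAULTS; a missing file means defaults."""
    config = dict(DEFAULTS)  # copy, so the defaults themselves never change
    if os.path.exists(path):
        with open(path, encoding="utf-8") as handle:
            config.update(json.load(handle))
    return config
```

The code then asks for `config["retries"]` instead of carrying a literal 3 around, and the customer (or cheaper staff) can change the file without touching the code.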

These are some of the basics of programming, and half of them have nothing to do with coding itself but with the practice of communication, politics and being organized. If you practice these things then good things will come.

Please add comments and I will work them into the text above for others.

Web Scraping – how to do it and add stability

If there is no API available to get info off of a web system, what do you do? You buy a scraping tool or … when that does not work because it can’t get past security or because it is a living website, you MAKE YOUR OWN. A young programmer asked me how to do this. Here is some advice on how-to and what not-to (do). The system that gave me most of my experience successfully scrapes gigabytes of info each night: it is initiated from a windows machine and uses another host on linux to do the scraping. The system babysits itself with 2 points of reference, so when one machine is down you are informed of it via email.

Idea 1- cURL (discussion follows)
Idea 2- learn how to automate internet explorer (for Windows) or Applescript (for Mac)

cURL is your best friend. It can access web pages with all sorts of security, it is free to use, and it seems to be integrated everywhere. However, don’t irritate facebook or kijiji or they will shut your account down. Now I am not myself on facebook because they think I am not really who I say I am, so I made another account with a fake name. I can’t even have a relationship on facebook with my spouse. Do they have a ‘divorce app’ on facebook so I can re-marry her from one account to another? They are ok with that. DO NOT SCRAPE CERTAIN SITES: read the fine print.

Back to cURL … it is the root of the engine. Here are some tips when you go to make one

  • there are a million options and they all interact – read, re-read and re-re-read the instruction pages
  • there are lots of programming environments that provide cURL APIs. I have used cURL in php, unix command line, perl and windows environments.
  • Google is your friend
  • don’t give up – it can be done
  • save your files to local web files
    • protect these directories if sensitive info is there.
    • delete temporary files
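
The fetch, save locally, protect pattern from the tips above can be sketched as follows. Note that this is not cURL: to keep the example self-contained it uses Python’s built-in urllib as a stand-in, and `fetch_to_file` with its retry counts is a made-up helper, so treat it as a rough sketch of the idea only.

```python
import os
import time
import urllib.request


def fetch_to_file(url, dest_path, attempts=3, pause=1.0, timeout=30):
    """Download url to dest_path, retrying on failure; returns bytes written."""
    last_error = None
    for attempt in range(attempts):
        try:
            with urllib.request.urlopen(url, timeout=timeout) as response:
                data = response.read()
            with open(dest_path, "wb") as handle:
                handle.write(data)
            os.chmod(dest_path, 0o600)  # owner-only, in case the page is sensitive
            return len(data)
        except OSError as err:  # URLError and socket timeouts are OSError subclasses
            last_error = err
            if attempt < attempts - 1:
                time.sleep(pause)  # give the site a moment before retrying
    raise RuntimeError(
        "giving up on %s after %d attempts" % (url, attempts)
    ) from last_error
```

Deleting the temporary files afterwards is then just an `os.remove(dest_path)` once the page has been processed.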

Careful – expect things to change on the website you are scraping, so …

  • build in cost and expectation for your customers to pay a maintenance agreement. This money should not be considered profit – this is what you will be doing next year when the website is upgraded.
  • use generic names
  • program in blocks – DOCUMENT YOUR CODE for yourself especially (more profit!)

Processing scraped web pages.

  • HTML5 – AHHHHH! HTML5 is not XML, so an XML parser breaks when someone changes the site to HTML5. But … it is not really HTML5 either, since it still has lots of XHTML and HTML 4 in it . . . did I mention to give your customers the expectation of a yearly fee for this tool?
    • I had to wrestle with this in late 2012 and 2013, when programming languages had not yet adopted standard HTML5 libraries. At the time I found only one pre-alpha library for parsing HTML5.
  • What happens if it is read and there are errors right at the borders of the info you just read?
    • build in retry loops with a slightly bigger (randomly bigger) size and try again
  • how about if the pages are not proper XML and weird things break the parser?
    • simply use a search and replace tool – keep it general with arrays of things to search and replace before it gets parsed
    • finding why XML breaks is a major pain. That is why I like XML because it is exact – play by the rules and all will work.
      • XML validators are your friend, but don’t trust just one: it might have less stringent rules and fool you into thinking all is well, when another will give you a HINT (not the answer) about the problem area
      • in your code – trap the error and spit out the data that is ‘offensive’ – look before and after it.
      • Do not try to find invalid XML by debugging the whole file. You might need to recreate an XML header and paste just the offending bit into a small test file.
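
The search-and-replace-before-parsing and trap-the-error tips can be sketched like this. My own scraper was not written in Python, so take this as an illustration under assumptions: the `FIXES` rules are example clean-ups, and `clean` and `parse_or_explain` are made-up names. The point is the shape: a general array of fixes, then a parser error that shows you the data around the offending spot.

```python
import re
import xml.etree.ElementTree as ET

# Hypothetical clean-up rules: (regex pattern, replacement), applied in order.
FIXES = [
    (r"&nbsp;", "&#160;"),   # an entity that plain XML parsers do not know
    (r"<br\s*>", "<br/>"),   # unclosed tag that is fine in HTML but not in XML
    # bare ampersands that are not already the start of an entity
    (r"&(?![A-Za-z][A-Za-z0-9]*;|#\d+;|#x[0-9A-Fa-f]+;)", "&amp;"),
]


def clean(text, fixes=FIXES):
    """Apply every search-and-replace rule before the text reaches the parser."""
    for pattern, replacement in fixes:
        text = re.sub(pattern, replacement, text)
    return text


def parse_or_explain(text, context=40):
    """Parse text as XML; on failure raise ValueError showing the offending region."""
    try:
        return ET.fromstring(text)
    except ET.ParseError as err:
        line_no, column = err.position  # 1-based line, 0-based column
        lines = text.splitlines()
        bad_line = lines[line_no - 1] if 0 < line_no <= len(lines) else ""
        start = max(0, column - context)
        snippet = bad_line[start:column + context]  # look before and after it
        raise ValueError(
            "XML error at line %d, column %d: %s | near: %r"
            % (line_no, column, err, snippet)
        ) from err
```

Usage is `root = parse_or_explain(clean(raw_page))`. When a site changes and a new kind of breakage shows up, the error message points at the spot and a new rule gets added to the list.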

That is the only advice I have for now. I wrote the scraper a long while ago: it was supposed to take 30 hours and it was more like 100+. Price carefully with lots of margin.