Help:Sequence Analysis:Software Design

Revision as of 11:36, 9 June 2008 by Randy


Software design 6-6-2008

I now have the sequence analysis software working much better.

- Edits go in and are saved.
- Comments are saved.
- The user can specify the result of the analysis.
- In the error section, it tells you when the read is beyond the 900 bp limit.

Note: deleting a base works. Changing a base works. To insert, just change a base to more than one base.
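The editing rule above can be sketched in a few lines of Python. This is only an illustration of the idea, not the Registry code: the function name `apply_edits` and the dictionary of edits are hypothetical, and I am assuming that deleting a base corresponds to replacing it with an empty string.

```python
def apply_edits(read, edits):
    """Apply single-position edits to a read (hypothetical helper).

    edits maps a 0-based position to the replacement text for that base:
      ""    deletes the base
      "T"   changes the base to T
      "CG"  replaces one base with two, which acts as an insertion
    """
    out = []
    for pos, base in enumerate(read):
        # Positions without an edit keep their original base.
        out.append(edits.get(pos, base))
    return "".join(out)


read = "ACGTAC"
print(apply_edits(read, {1: ""}))    # delete the C at position 1
print(apply_edits(read, {1: "T"}))   # change the C to T
print(apply_edits(read, {1: "CG"}))  # insert a G after the C
```

Because every edit is keyed to a position in the original read, the edits do not shift each other, and the original read can always be recovered by dropping the edit list.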

I have gone through a bunch of parts in plate 1000 with no errors.

After examining and editing some sequences, I found that most of the parts were good after all. Some were clearly wrong.

However, while it was possible to find that parts were good, it was not easy. It was necessary to use all the information available. For example, in one part, the first 800 bases were easy, but I had to look at the "blast against part" for both directions to see that all the bases were well covered.

One part had been processed by Long Read. This fixed up an otherwise bad reading.

I noticed that some of the Phred trace files are missing. We should find out why, but this is not urgent.


Notes for the next design

Having done this version of the software, the design for the next version is clear. Perhaps this can be done in the late summer or early fall. Here are some changes that should be made.

1. Use the Phred data. (However, some users will only have machine-called sequences. Perhaps we can run their raw data through Phred for them.)

2. The Long Read program did a better job of calling bases than the machine when the quality was low. We need to see whether this is generally true.

3. The current software makes a single alignment of the reading to the part. Insertions and deletions are not dealt with at all. This is probably the largest issue with the program. A new version needs to provide a print-out like the two-sequence BLAST output, so that you can easily see the full alignment between the part and each of the sequences.

4. The editing must be like normal WYSIWYG editing. Drag across the sequence and type.

5. Internally, the software needs a new data structure to deal with all these changes.

6. We should be able to see the electropherogram on the web site, all lined up with the part and the sequences.
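As a rough illustration of item 3, here is a minimal sketch of a gapped alignment with a BLAST-style print-out. It uses plain Needleman-Wunsch global alignment, which is my choice for the example and not necessarily what the next version would use. The function names and the scoring values (`match`, `mismatch`, `gap`) are assumptions made up for the sketch.

```python
def align(part, read, match=1, mismatch=-1, gap=-2):
    """Global (Needleman-Wunsch) alignment; scoring values are assumed."""
    n, m = len(part), len(read)
    # score[i][j] is the best score for aligning part[:i] with read[:j].
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            step = match if part[i - 1] == read[j - 1] else mismatch
            score[i][j] = max(score[i - 1][j - 1] + step,
                              score[i - 1][j] + gap,
                              score[i][j - 1] + gap)
    # Trace back from the corner to recover one best alignment.
    top, bottom = [], []
    i, j = n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0:
            step = match if part[i - 1] == read[j - 1] else mismatch
            if score[i][j] == score[i - 1][j - 1] + step:
                top.append(part[i - 1])
                bottom.append(read[j - 1])
                i -= 1
                j -= 1
                continue
        if i > 0 and score[i][j] == score[i - 1][j] + gap:
            top.append(part[i - 1])   # base missing from the read
            bottom.append("-")
            i -= 1
        else:
            top.append("-")           # extra base in the read
            bottom.append(read[j - 1])
            j -= 1
    return "".join(reversed(top)), "".join(reversed(bottom))


def format_alignment(top, bottom, width=60):
    """Lay out the two gapped strings in blocks, with | under matches."""
    lines = []
    for start in range(0, len(top), width):
        a = top[start:start + width]
        b = bottom[start:start + width]
        mid = "".join("|" if x == y else " " for x, y in zip(a, b))
        lines += ["part  " + a, "      " + mid, "read  " + b, ""]
    return "\n".join(lines)


top, bottom = align("ACGTACGT", "ACGACGT")
print(format_alignment(top, bottom))
```

A deletion in the reading shows up as a "-" in the read line, and an insertion as a "-" in the part line, so both are visible at a glance. The quadratic table is fine for readings of around 900 bp, but a real version would probably also want to use the Phred quality values from item 1.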

It may be possible for some other program to take our information, let the user do a good job of editing, and then dump its modified document back into the Registry so that others can see what was done. (This seems unlikely.)

Other comments?

Randy