[Home]History of KataPotterByEmilyBache

FrontPage | RecentChanges | Preferences


Revision 15 . . December 21, 2010 12:00 pm by Emily Bache [removing old cotent]
Revision 14 . . December 20, 2010 2:38 pm by 202.69.40.27 [The following links should be included]
Revision 13 . . (edit) March 10, 2010 9:23 am by s235-148.resnet.ucla.edu
Revision 12 . . (edit) December 3, 2009 11:25 pm by vpn-8061f549.host.ucla.edu
  

Difference (from prior major revision) (no other diffs)

Changed: 1,357c1
I have tackled KataPotter a few times, with various dojo groups, and using various programming languages. This page is about my best attempts to solve it by myself, using python. I welcome feedback and code reviews, as well as comments on the format of this article. -- EmilyBache

I decided to work with the unit tests that were helpfully outlined on the French wiki for the ParisDojo. These tests really strip the problem back to the bare essentials and help you to focus on what is important. They also lead you through the complexity of the Kata one step at a time. I found it much easier to tackle this Kata using the tests given than to start absolutely from scratch without them. However, I found a rather severe disadvantage, and that is is leads you to believe that when all the tests pass, your solution works for all cases. The first time I published a solution to this Kata, my friend AndrewDalke? pointed out a test case my code didn't handle. As well as lots of other things that could be improved. So I went back and worked out why my code didn't work, and did the Kata again. This article represents my best attempt at a solution so far, but I'm sure Andrew and others will have opinions on it. That is the best and the hardest thing with the coder's dojo; it will make me a better coder, if I can take the criticism and learn from it![Affordable Heating in Staten Island]


#potterTest.py
import unittest
from potter import *

class PotterTest(unittest.TestCase):

def testBasics(self):
self.failUnlessEqual(0, price([]))
self.failUnlessEqual(8, price([0]))
self.failUnlessEqual(8, price([1]))
self.failUnlessEqual(8, price([2]))
self.failUnlessEqual(8, price([3]))
self.failUnlessEqual(8, price([4]))
self.failUnlessEqual(8 * 2, price([0, 0]))
self.failUnlessEqual(8 * 3, price([1, 1, 1]))

(red)
This test is easy to implement:


#potter.py
def price(basket):
return 8*len(basket)

(green)
And then a simple refactoring makes it look a little better:


#potter.py
SINGLE_BOOK_PRICE = 8

def price(basket):
return SINGLE_BOOK_PRICE*len(basket)

(refactor)

And so to the second test

#potterTest.py
def testSimpleDiscounts(self):
self.failUnlessEqual(8 * 2 * 0.95, price([0, 1]))
self.failUnlessEqual(8 * 3 * 0.9, price([0, 2, 4]))
self.failUnlessEqual(8 * 4 * 0.8, price([0, 1, 2, 4]))
self.failUnlessEqual(8 * 5 * 0.75, price([0, 1, 2, 3, 4]))

(red)
Again, this is fairly easy to implement, since we only ever have to deal with one set of books.

SINGLE_BOOK_PRICE = 8

def price(basket):
book_set = set(basket)
discount_multiplier = 1
if len(book_set) == 2:
discount_multiplier = 0.95
elif len(book_set) == 3:
discount_multiplier = 0.9
elif len(book_set) == 4:
discount_multiplier = 0.8
elif len(book_set) == 5:
discount_multiplier = 0.75
return SINGLE_BOOK_PRICE*len(basket)*discount_multiplier

(green)
The switch statement is unweildy, though. I noiced that since each case is an integer, you can use an array to enumerate the cases and just index them.


#potter.py
SINGLE_BOOK_PRICE = 8
DISCOUNT_MULTIPLIER = [1, 1, 0.95, 0.9, 0.8, 0.75]

def price(basket):
book_set = set(basket)
return SINGLE_BOOK_PRICE*len(basket)*DISCOUNT_MULTIPLIER[len(book_set)]

(refactored)

So far so good. Next test.

#potterTest - add third test
def testSeveralDiscounts(self):
self.failUnlessEqual(8 + (8 * 2 * 0.95), price([0, 0, 1]))
self.failUnlessEqual(2 * (8 * 2 * 0.95), price([0, 0, 1, 1]))
self.failUnlessEqual((8 * 4 * 0.8) + (8 * 2 * 0.95), price([0, 0, 1, 2, 2, 3]))
self.failUnlessEqual(8 + (8 * 5 * 0.75), price([0, 1, 1, 2, 3, 4]))

(red)
Now we have to deal with baskets containing several book sets, and this means we have to partition the basket of books into sets. In my earlier attempts at this kata I took each book in the order it appeared, until I found it was smarter to sort them so the most frequently occurring books come first. Then you can also initialize the book sets up front, because you know how many there will be. So I first implement a method to find the frequency of each book, and return it as a sorted list of tuples.


#potterTest.py
def test_sort_by_frequency(self):
self.failUnlessEqual([(1, 0)], book_frequencies([0,]))
self.failUnlessEqual([(2, 1), (1, 0)], book_frequencies([0,1,1]))

#potter.py
def book_frequencies(books):
counts = dict.fromkeys(range(5), 0)
for book in books:
counts[book] += 1
counts_inverted = [(count, book) for book, count in counts.items()]
counts_inverted.sort()
counts_inverted.reverse()
return counts_inverted


So now we can go through and just assign books to booksets


#potterTest.py
def test_assign_to_booksets(self):
self.failUnlessEqual([[0]], assign_booksets( [(1, 0)] ))
self.failUnlessEqual([[1, 0], [1]], assign_booksets( [(2, 1), (1, 0)] ))

#potter.py
def assign_booksets(book_frequencies):
max_sets = book_frequencies[0][0]
book_sets = [[] for i in range(max_sets)]
for count, book in book_frequencies:
for i in range(count):
book_sets[i].append(book)
return book_sets


And then we just have to string these two methods together and add some code to calculate the total price of the list of booksets.


#potter.py
def price(books):
if not books:
return 0
frequencies = book_frequencies(books)
book_sets = assign_booksets(frequencies)

total_price = 0
for book_set in book_sets:
size = len(book_set)
total_price += SINGLE_BOOK_PRICE * size * DISCOUNT_MULTIPLIER[size]
return total_price


(green)

So now we have passed 3 of the four acceptance tests. I think it's probably worth pulling that last part out into a method for calculating the price of a list of booksets.


#potter.py
def price_of_booksets(book_sets):
total_price = 0
for book_set in book_sets:
size = len(book_set)
total_price += SINGLE_BOOK_PRICE * size * DISCOUNT_MULTIPLIER[size]
return total_price

def price(books):
if not books:
return 0
frequencies = book_frequencies(books)
book_sets = assign_booksets(frequencies)
total_price = price_of_booksets(book_sets)
return total_price

(refactored)

And so on to the last test:


def testEdgeCases(self):
self.failUnlessEqual(2 * (8 * 4 * 0.8), price([0, 0, 1, 1, 2, 2, 3, 4]))
self.failUnlessEqual(3 * (8 * 5 * 0.75) + 2 * (8 * 4 * 0.8),
price([0, 0, 0, 0, 0,
1, 1, 1, 1, 1,
2, 2, 2, 2,
3, 3, 3, 3, 3,
4, 4, 4, 4]))
self.failUnlessAlmostEqual((8 * 2 * 0.95) + 2 * (8 * 3 * 0.9) + 2 * (8 * 4 * 0.8),
price([0,
1,
2, 2, 2, 2, 2,
3, 3, 3, 3, 3,
4, 4, 4, 4]),
2)

(red)

The last acceptance test demands that we determine the cheapest partitioning of books into sets, not just the one that creates the greatest number of big sets. Because of a quirk of the pricing rules, 2 sets of four is cheaper than a set of five and a set of three.

So looking at my code, the method that needs to change is assign_booksets. It needs to take into accout the quirk in the pricing. One way forward is just to put in a check that makes sure that instead of adding a 5th book to a set of 4, we instead look to see if there is a set of 3 we could add it to to make a set of 4. This feels like a fragile solution to me. At present the only part of my code that knows there are 5 books in a set is the DISCOUNT_MULTIPLIER. What happens if I add another couple of books to the series? What will the publisher decide to do about the discounting scheme? Guessing what the customer will do next is a dangerous business. Classic TDD philosophy says to solve only today's problem and trust we can solve tomorrow's problem tomorrow.

I had a discussion about this with a friend of mine, AndrewDalke?, and he convinced me that it is fruitless to try to write code that is general enough to deal with every pricing scheme the publisher might come up with in future. You end up writing a whole general optimizer basically. And even then the publisher can probably come up with something to fox it. So I decided to go with the potentially fragile solution. Here is what I came up with.


#potter.py
def assign_booksets(book_frequencies):
max_sets = book_frequencies[0][0]
log("frequencies of books %s implies we need %s sets" % (book_frequencies, max_sets))
book_sets = [[] for i in range(max_sets)]
for count, book in book_frequencies:
for i in range(count):
# if we are about to create a set of 5, see if there is a set of 3 coming up that we could add it to instead.
if len(book_sets[i]) == 4:
if 3 in [len(book_set) for book_set in book_sets[i:]]:
i += 1
# don't add the book where it is already placed, keep looking.
# there *will* be a space for it, since we know
# we have initialized enough booksets for all the books.
while book in book_sets[i]:
i += 1
book_sets[i].append(book)
return book_sets

(green)

The fact that I needed to add several comments there to explain what this code is doing is a bit of a smell. I think a little refactoring might make this code clearer. I extracted a method, and changed it around a bit in a way that I think is clearer, but less efficient since it loops through book_sets more times. I felt the increased readability justified it. Here is the final tests and code:


#potterTest.py
import unittest
from potter import *

class PotterTest(unittest.TestCase):

def testBasics(self):
self.failUnlessEqual(0, price([]))
self.failUnlessEqual(8, price([0]))
self.failUnlessEqual(8, price([1]))
self.failUnlessEqual(8, price([2]))
self.failUnlessEqual(8, price([3]))
self.failUnlessEqual(8, price([4]))
self.failUnlessEqual(8 * 2, price([0, 0]))
self.failUnlessEqual(8 * 3, price([1, 1, 1]))

def testSimpleDiscounts(self):
self.failUnlessEqual(8 * 2 * 0.95, price([0, 1]))
self.failUnlessEqual(8 * 3 * 0.9, price([0, 2, 4]))
self.failUnlessEqual(8 * 4 * 0.8, price([0, 1, 2, 4]))
self.failUnlessEqual(8 * 5 * 0.75, price([0, 1, 2, 3, 4]))

def testSeveralDiscounts(self):
self.failUnlessEqual(8 + (8 * 2 * 0.95), price([0, 0, 1]))
self.failUnlessEqual(2 * (8 * 2 * 0.95), price([0, 0, 1, 1]))
self.failUnlessEqual((8 * 4 * 0.8) + (8 * 2 * 0.95), price([0, 0, 1, 2, 2, 3]))
self.failUnlessEqual(8 + (8 * 5 * 0.75), price([0, 1, 1, 2, 3, 4]))

def test_sort_by_frequency(self):
self.failUnlessEqual([(1, 0)], book_frequencies([0]))
self.failUnlessEqual([(2, 1), (1, 0)], book_frequencies([0,1,1]))

def test_assign_to_booksets(self):
self.failUnlessEqual([[0]], assign_booksets( [(1, 0)] ))
self.failUnlessEqual([[1, 0], [1]], assign_booksets( [(2, 1), (1, 0)] ))


def testEdgeCases(self):
self.failUnlessEqual(2 * (8 * 4 * 0.8), price([0, 0, 1, 1, 2, 2, 3, 4]))
self.failUnlessEqual(3 * (8 * 5 * 0.75) + 2 * (8 * 4 * 0.8),
price([0, 0, 0, 0, 0,
1, 1, 1, 1, 1,
2, 2, 2, 2,
3, 3, 3, 3, 3,
4, 4, 4, 4]))
self.failUnlessAlmostEqual((8 * 2 * 0.95) + 2 * (8 * 3 * 0.9) + 2 * (8 * 4 * 0.8),
price([0,
1,
2, 2, 2, 2, 2,
3, 3, 3, 3, 3,
4, 4, 4, 4]),
2)

# potter.py
import sys

LOGGING = False

SINGLE_BOOK_PRICE = 8
# WARNING TO FUTURE DEVELOPERS
# this code is optimized to calculate the correct price for
# precisely these discounts. If this variable is changed, you
# will need to change the algorithm in find_best_bookset too.
DISCOUNT_MULTIPLIER = [1, 1, 0.95, 0.9, 0.8, 0.75]

def log(msg):
if LOGGING:
print >> sys.stderr, msg

def book_frequencies(books):
counts = 5 * [0]
for book in books:
counts[book] += 1
counts_inverted = [(count, book) for book, count in enumerate(counts)]
counts_inverted.sort()
counts_inverted.reverse()
return counts_inverted

def assign_booksets(book_frequencies):
max_sets = book_frequencies[0][0]
log("frequencies of books %s implies we need %s sets" % (book_frequencies, max_sets))
book_sets = [[] for i in range(max_sets)]
for count, book in book_frequencies:
for i in range(count):
best_bookset = find_best_bookset(book_sets, book)
best_bookset.append(book)
return book_sets

def find_best_bookset(book_sets, book):
for book_set in book_sets:
if book in book_set:
continue
# if we are about to create a set of 5,
# see if there is a set of 3 that we could add it to instead.
if len(book_set) == 4:
sets_of_3s = [b_set for b_set in book_sets
if (len(b_set) == 3 and book not in b_set)]
if sets_of_3s:
log("found set of 3 %s to add book %s to, will not add to set of 4 %s"
% (sets_of_3s, book, book_set))
return sets_of_3s[0]
return book_set

def price_of_booksets(book_sets):
total_price = 0
for book_set in book_sets:
size = len(book_set)
assert size < len(DISCOUNT_MULTIPLIER), "Bookset is too big for current pricing scheme. Current pricing scheme only allows booksets max size 5, you have %s" % size
total_price += SINGLE_BOOK_PRICE * size * DISCOUNT_MULTIPLIER[size]
return total_price

def price(books):
if not books:
return 0
log("will calculate price for books %s" % books)
frequencies = book_frequencies(books)

book_sets = assign_booksets(frequencies)
log("assigned books to sets %s" % book_sets)

total_price = price_of_booksets(book_sets)
return total_price

if __name__ == "__main__":
LOGGING = True
for line in sys.stdin:
print price(eval(line))


***Note. [Paul Phifer] is currently working on an alternative python fix for KataPotter in addition to Emily's, but will use semantic mapping and logic to solve it. In the intermin, you can contact Paul about the project if you have any questions regarding the use of python and manually executed php scripts.

* [Dissertation]
* [Buy Dissertation]
* [Essay Help]
* [Thesis Help]
* [Buy Thesis]
please remove this page

FrontPage | RecentChanges | Preferences
Search: