# Problem based on example in some of Steve Hanks' papers # BL - blemished, FL - falwed, PA painted discount: 0.95 values: reward states: NFL-NBL-NPA NFL-NBL-PA FL-NBL-PA FL-BL-NPA actions: paint inspect ship reject observations: NBL BL start: 0.5 0.0 0.0 0.5 T: paint : NFL-NBL-NPA : NFL-NBL-NPA 0.1 T: paint : NFL-NBL-NPA : NFL-NBL-PA 0.9 T: paint :NFL-NBL-PA : NFL-NBL-PA 1.0 T: paint : FL-NBL-PA : FL-NBL-PA 1.0 T: paint : FL-BL-NPA : FL-NBL-PA 0.9 T: paint :FL-BL-NPA : FL-BL-NPA 0.1 T: inspect identity T: reject : * 0.5 0.0 0.0 0.5 T: ship : * 0.5 0.0 0.0 0.5 O: paint : * : NBL 1.0 O: inspect : NFL-NBL-NPA : NBL 0.75 O: inspect : NFL-NBL-NPA : BL 0.25 O: inspect : NFL-NBL-PA : NBL 0.75 O: inspect : NFL-NBL-PA : BL 0.25 O: inspect : FL-NBL-PA : NBL 0.75 O: inspect : FL-NBL-PA : BL 0.25 O: inspect : FL-BL-NPA : NBL 0.25 O: inspect : FL-BL-NPA : BL 0.75 O: ship : * : NBL 1.0 O: reject : * : NBL 1.0 R: ship : NFL-NBL-PA : * : * 1.0 R: reject : FL-BL-NPA : * : * 1.0 # I added these when things didn't work out. R: ship : NFL-NBL-NPA : * : * -1.0 R: reject : NFL-NBL-NPA : * : * -1.0 R: reject : NFL-NBL-PA : * : * -1.0 R: ship : FL-NBL-PA : * : * -1.0 R: ship : FL-BL-NPA : * : * -1.0