Posts in the 'Development/python' category

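Lecture at the SW Engineering Center

Development/python 2012. 4. 28. 23:34

On Thursday, April 26, I took a day off and visited the SW Engineering Center.

The photo below shows me casting an area-of-effect Sleep spell.

I'm not used to lecturing to such a large audience, so I seem to have unintentionally put everyone to sleep. Or maybe it was because the Diablo 3 closed beta had started the day before; at least that's how I console myself...

ps. Thanks to principal engineer 강승준 for setting this up~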

AND

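Python lecture + consulting

Development/python 2012. 3. 29. 00:10

A few days ago, I received a request for a lecture plus consulting.

The company wants to put user-modifiable scripts on top of its own engine. They asked for a rough introduction to and comparison of scripting languages, along with advice on the feasibility of their requirements, the design, and so on.

I don't much like standing in front of people and I'm a stay-at-home type, so this was out of the ordinary for me. After thinking it over carefully, I decided to do it, and went there today.

The first part was a two-hour lecture session, and the second part was a two-hour discussion session about the project. Personally, I think it was a worthwhile time in which both sides got what they wanted... :)

The materials below are the slides for the parts I presented. I put them on slideshare in case they turn out to be useful someday, haha.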

Python3 brief summary (PowerPoint on slideshare, from HoChul Shin)

Lua vs python (PowerPoint on slideshare, from HoChul Shin)

ps. Thanks also to senior engineer 최동진, who presented the C binding part and worked late into the night.

AND

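[Regex] greedy vs lazy

Development/python 2010. 7. 3. 13:14

I had to organize my notes on regular expressions for the first time in a while, so I'm writing them up on the blog too. :)

Regular expression quantifiers can match in a greedy way or a lazy (non-greedy) way.

For example, ".*" is greedy: it tries to match the longest string that still satisfies the pattern.
Conversely, ".*?" is lazy: it tries to match the shortest string that still satisfies the pattern.

Hmm, now that I've written it, even I can't tell what it means :)
Tweaking an example slightly and running it should make the difference clear.

First, let's greedily select strings enclosed in "<" and ">" as shown below. (@python 2.6)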

>>> import urllib
>>> html = urllib.urlopen('http://www.python.org').read()
>>> import re
>>> greedy = re.compile(r'<.*>', re.I|re.S)
>>> len( greedy.findall(html) )
1

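Oops, there is only one result. Because the pattern tried to select the longest string enclosed in <>, it ended up selecting everything from the first "<" to the last ">", that is, the whole run of HTML tags.
This time, let's change it slightly and select lazily.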

>>> lazy = re.compile(r'<.*?>', re.I|re.S)
>>> len( lazy.findall(html) )
469

Yes, this time it selected the shortest strings enclosed in <>, so every tag was selected individually.
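The same difference can be reproduced without a network connection. The sketch below is not from the original thread; it assumes Python 3 and uses a made-up one-line HTML string (`html`) in place of the python.org page.

```python
import re

# A small fixed document standing in for the downloaded page (assumption).
html = "<html><body><p>hello</p></body></html>"

greedy = re.compile(r"<.*>", re.S)   # longest possible match
lazy = re.compile(r"<.*?>", re.S)    # shortest possible match

greedy_matches = greedy.findall(html)
lazy_matches = lazy.findall(html)

print(len(greedy_matches))  # 1: a single match from the first "<" to the last ">"
print(lazy_matches)         # six matches, one per tag
```

The greedy pattern swallows the whole string in one match, while the lazy one stops at the first ">" it can and therefore returns each tag separately.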

Original: http://groups.google.com/group/python3/browse_thread/thread/c6c40a3738818d77

AND

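Manga tweet bot

Lately I seem to read Twitter more often than RSS, so I launched a tweet bot dedicated to manga --;
On top of that, the translation blogs I used to visit have been falling one after another like autumn leaves over copyright issues.

The manga it collects are Fullmetal Alchemist, Claymore, Naruto, Bleach, Hunter x Hunter, and One Piece.
Fullmetal Alchemist and Claymore come out once a month,
and Naruto, Bleach, Hunter x Hunter, and One Piece come out every week, around Friday.

If you're interested in these manga, follow it~
http://twitter.com/hotmanga

The downsides are that it's in English and that the selection of manga is entirely up to me... --;
Thank you.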

AND

Train Timetable

Development/python 2009. 8. 22. 01:03

This time it's the second problem, Train Timetable.

The problem is as follows.

Problem

A train line has two stations on it, A and B. Trains can take trips from A to B or from B to A multiple times during a day. When a train arrives at B from A (or arrives at A from B), it needs a certain amount of time before it is ready to take the return journey - this is the turnaround time. For example, if a train arrives at 12:00 and the turnaround time is 0 minutes, it can leave immediately, at 12:00.

A train timetable specifies departure and arrival time of all trips between A and B. The train company needs to know how many trains have to start the day at A and B in order to make the timetable work: whenever a train is supposed to leave A or B, there must actually be one there ready to go. There are passing sections on the track, so trains don't necessarily arrive in the same order that they leave. Trains may not travel on trips that do not appear on the schedule.

Input

The first line of input gives the number of cases, N. N test cases follow.

Each case contains a number of lines. The first line is the turnaround time, T, in minutes. The next line has two numbers on it, NA and NB. NA is the number of trips from A to B, and NB is the number of trips from B to A. Then there are NA lines giving the details of the trips from A to B.

Each line contains two fields, giving the HH:MM departure and arrival time for that trip. The departure time for each trip will be earlier than the arrival time. All arrivals and departures occur on the same day. The trips may appear in any order - they are not necessarily sorted by time. The hour and minute values are both two digits, zero-padded, and are on a 24-hour clock (00:00 through 23:59).

After these NA lines, there are NB lines giving the departure and arrival times for the trips from B to A.

Output

For each test case, output one line containing "Case #x: " followed by the number of trains that must start at A and the number of trains that must start at B.

Limits

1 ≤ N ≤ 100

Small dataset

0 ≤ NA, NB ≤ 20

0 ≤ T ≤ 5

Large dataset

0 ≤ NA, NB ≤ 100

0 ≤ T ≤ 60

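Link each trip to the nearest later trip that the same train can run next (one that departs from the arrival station no earlier than the arrival time plus the turnaround time),
then count the trips that have no preceding trip linked to them; each of those needs a train that starts the day at its departure station.

The source code is as follows.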

import sys
import datetime
 
class train:
  def __init__(self, row, dir):
    st = row.split(' ')[0]
    end = row.split(' ')[1]    
    self.start = datetime.datetime( 2009,1,1, int(st.split(':')[0]), int(st.split(':')[1]) )
    self.end = datetime.datetime( 2009,1,1, int(end.split(':')[0]), int(end.split(':')[1]) )
    self.dir = dir
    self.reserved1 = 0
    self.reserved2 = 0
 
def asc_start(a,b):
  if a.start < b.start:
    return -1
  else:
    return 1
 
def desc_end(a,b):
  if a.end < b.end:
    return 1
  else:
    return -1
 
if __name__=="__main__":
  if len(sys.argv)>1:
    inp = sys.argv[1]
  else:
    print "append an input file param"
    sys.exit()
 
  f = open( inp, 'rt' )
  nTC = int( f.readline() )
  print 'the number of tc :', nTC
  
  output_file = inp.split('.')[0]+"_output.txt"
  fout = open( output_file, 'wt')
  
  for i in range(0,nTC):
    turnaround = int( f.readline() )
    
    NA_NB = f.readline()
    NA = int( NA_NB.split(' ')[0] )
    NB = int( NA_NB.split(' ')[1] )
    
    trains = []
    for n in range(0,NA):
      row = f.readline()
      trains.append( train( row, 'A' ) )
    for n in range(0,NB):
      row = f.readline()
      trains.append( train( row,'B') )
 
    trains.sort(desc_end)
    
    #process
    ta = datetime.timedelta( 0, turnaround*60 )
    
    for t1 in trains:
      if t1.reserved2>0: continue
      cands = []
      for t2 in trains:
        if t2.reserved1>0: continue
          
        if ( t1.end + ta <= t2.start ) and (t1.dir != t2.dir):
          cands.append( t2 )
          
      if cands:
        cands.sort( asc_start )
        min = cands[0]
        #print t1.start, t1.end, t1.dir
        #print '\t',min.start, min.end, min.dir
        
        t1.reserved2 = min.reserved1 = 1
 
    NA = NB = 0
    for t in trains:
      if t.reserved1==0:
        if t.dir=='A':
          NA+=1
        else:
          NB+=1
 
    # result
    print 'Case #'+str(i+1)+':' ,NA, NB
    fout.write( 'Case #'+str(i+1)+': ' +str(NA) + ' '+str( NB)+'\n')
  
  f.close()
  fout.close()

The results are as follows, and all of them are Correct.
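For readers on Python 3, here is a compact sketch of the same counting problem. It is not the code above: instead of linking trips pairwise, it sweeps the trips in departure order and keeps a heap of "ready" times per station, which is a different bookkeeping for the same greedy idea. The names `to_minutes` and `count_trains` are made up for this illustration, and the example timetable is only a small test input.

```python
import heapq

def to_minutes(hhmm):
    """Convert an 'HH:MM' string to minutes since midnight."""
    hours, minutes = hhmm.split(":")
    return int(hours) * 60 + int(minutes)

def count_trains(turnaround, a_to_b, b_to_a):
    """Return (trains that must start at A, trains that must start at B)."""
    trips = [(to_minutes(d), to_minutes(a), "A") for d, a in a_to_b]
    trips += [(to_minutes(d), to_minutes(a), "B") for d, a in b_to_a]
    trips.sort()  # process trips in order of departure time

    ready = {"A": [], "B": []}   # min-heaps: when each waiting train is ready
    needed = {"A": 0, "B": 0}    # trains that must be there at the start of day
    for dep, arr, origin in trips:
        dest = "B" if origin == "A" else "A"
        if ready[origin] and ready[origin][0] <= dep:
            heapq.heappop(ready[origin])   # reuse a train that is already waiting
        else:
            needed[origin] += 1            # nobody is ready: a new train is required
        heapq.heappush(ready[dest], arr + turnaround)
    return needed["A"], needed["B"]

result = count_trains(
    5,
    [("09:00", "12:00"), ("10:00", "13:00"), ("11:00", "12:30")],
    [("12:02", "15:00"), ("09:00", "10:30")],
)
print(result)  # (2, 2)
```

Because every arrival is later than its own departure, any train that is ready by a given departure time has already been pushed onto the heap when that departure is processed, so a single pass is enough.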

AND

Saving the Universe

Development/python 2009. 8. 21. 01:48

I've been feeling down lately, so I solved 'Saving the Universe', one of the CodeJam 2008 Qualification Round problems.

Problem 1 (Saving the Universe) is as follows.

Problem

The urban legend goes that if you go to the Google homepage and search for "Google", the universe will implode. We have a secret to share... It is true! Please don't try it, or tell anyone. All right, maybe not. We are just kidding.

The same is not true for a universe far far away. In that universe, if you search on any search engine for that search engine's name, the universe does implode!

To combat this, people came up with an interesting solution. All queries are pooled together. They are passed to a central system that decides which query goes to which search engine. The central system sends a series of queries to one search engine, and can switch to another at any time. Queries must be processed in the order they're received. The central system must never send a query to a search engine whose name matches the query. In order to reduce costs, the number of switches should be minimized.

Your task is to tell us how many times the central system will have to switch between search engines, assuming that we program it optimally.

Input

The first line of the input file contains the number of cases, N. N test cases follow.

Each case starts with the number S -- the number of search engines. The next S lines each contain the name of a search engine. Each search engine name is no more than one hundred characters long and contains only uppercase letters, lowercase letters, spaces, and numbers. There will not be two search engines with the same name.

The following line contains a number Q -- the number of incoming queries. The next Q lines will each contain a query. Each query will be the name of a search engine in the case.

Output

For each input case, you should output:

Case #X: Y

where X is the number of the test case and Y is the number of search engine switches. Do not count the initial choice of a search engine as a switch.

Limits

0 < N ≤ 20

Small dataset

2 ≤ S ≤ 10

0 ≤ Q ≤ 100

Large dataset

2 ≤ S ≤ 100

0 ≤ Q ≤ 1000

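Problem link: http://code.google.com/codejam/contest/dashboard?c=agxjb2RlamFtLXByb2RyEAsSCGNvbnRlc3RzGI36AQw#

Simply keep a list of the search engines and mark each one as its name shows up in the queries; when the list fills up (every engine has been seen), count one switch, clear it, and repeat.
The Python code is as follows. (Written with version 2.5.4)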

import sys
 
def isAllChecked(d):
  ret = True
  for k in d.keys():
    if d[k]<1:
      ret = False
  return ret
 
if __name__=="__main__":
  if len(sys.argv)>1:
    inp = sys.argv[1]
  else:
    print "append an input file param"
    sys.exit()
 
  f = open( inp, 'rt' )
  nTC = int( f.readline() )
  print 'the number of tc :', nTC
  
  output_file = inp.split('.')[0]+"_output.txt"
  fout = open( output_file, 'wt')
  
  for i in range(0,nTC):
    nEngine = int( f.readline() )
    engines = []
    for e in range(0,nEngine):
      engine = f.readline()
      engines.append( engine )
    nKeyword = int( f.readline() )
    keywords = []
    for k in range(0,nKeyword):
      keyword = f.readline()
      keywords.append( keyword )
    
    # process
    dEngine = {}
    for e in engines:
      dEngine[e]=0
    nCount = 0
    for k in keywords:
      if k in dEngine.keys():
        dEngine[k] += 1
        if isAllChecked(dEngine):
          for e in engines:
            dEngine[e] = 0
          nCount += 1
          dEngine[k] += 1
          
    # result
    print 'Case #' + str(i+1)+': '+str( nCount )
    fout.write( 'Case #' + str(i+1)+': '+str( nCount ) + '\n' )
      
  f.close()
  fout.close()

The results are as follows, and both are Correct.
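The "fill up, then clear" idea can also be written with a set instead of a dictionary of counters. This is only a sketch for Python 3, not the submitted code; `count_switches` and the three-engine example are invented for illustration.

```python
def count_switches(engines, queries):
    """Count the minimum number of engine switches for the given query stream."""
    engines = set(engines)
    seen = set()      # engine names queried since the last switch
    switches = 0
    for query in queries:
        if query not in engines:
            continue  # the problem statement says this never happens
        seen.add(query)
        if len(seen) == len(engines):
            # Every engine has now been named, so the current one is used up:
            # switch here and start a new segment that begins with this query.
            switches += 1
            seen = {query}
    return switches

example = count_switches(["A", "B", "C"], ["A", "B", "C", "A", "B", "C"])
print(example)  # 2
```

Each segment is stretched as far as possible before switching, which is why the count is minimal.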

AND

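Using Twitter secretly at work

Development/python 2009. 7. 12. 00:16

Twitter is the SNS everyone is talking about these days, but monitoring it at work felt too conspicuous, so I had been wondering how to move it from a browser plugin or standalone app to the command line. Twitter's open api is well done and the python-twitter module wraps it up neatly, so the implementation turned out to be really easy.

In particular, I could see why there is so much spam. Following people and spraying spam while monitoring is just too easy --;

Anyway, I threw together something suitable for running at work in a slightly transparent command-line window. The source is linked below.
[peeping_tweet.zip]
As you can see, the source is almost pure python-twitter --;

If you open peepingtweet.py, the top of the source looks like this: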

login_id = ""
login_password = ""

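With the values left empty as above, public tweets are shown, as in the screenshot below.

To see your friends' tweets, put appropriate values in login_id and login_password, and they are displayed properly, as in the screenshot below.

At work I should run it on Ubuntu... --;

ps1. It is based on python 2.5.
So if Python isn't installed, you need to install it.
http://python.org/download/releases/2.5.4/

ps2. I'm skipping py2exe...
Oh, and this is monitoring only. There is no feature for posting.
It would be easy to add, of course, but I haven't felt the need for it ...
If you need it, just add it yourself...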

AND

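yes24 sales tracking program

Development/python 2009. 7. 7. 00:35

yes24 doesn't keep a log of its sales index,
and I wanted to see how sales change over time, so I wrote this simple program.

Add it to scheduled tasks or startup programs, or register it in crontab, and let it run.
It records only once a day; if a record for that day already exists, it is not written again.
It uses sqlite3, so you can inspect the data directly from the console or use the DB class's print function. I check it with sqlite manager for firefox, as in the picture below.

The source code is as follows.
To change what is collected, edit the books dictionary near the top.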

import urllib2, time, traceback
from BeautifulSoup import BeautifulSoup
import sqlite3

# Books to track: table name -> yes24 product page
books = {
  'python':'http://www.yes24.com/24/goods/3432490',
  'lua':'http://www.yes24.com/24/goods/3081202'
}

def getContent( url ):
  req = urllib2.Request( url )
  response = urllib2.urlopen(req)
  return response.read()

class DB:
  "SQLITE3 wrapper class"
  def __init__(self):
    self.conn = sqlite3.connect('bookDB')
    self.cursor = self.conn.cursor()
    for title in books.keys():
      self.cursor.execute('CREATE TABLE IF NOT EXISTS %s(date text, sale int)'%title)
      # index names are shared across the whole database, so each table gets its own
      self.cursor.execute('CREATE UNIQUE INDEX IF NOT EXISTS IDX_%s ON %s(date)'%(title,title))

  def __del__(self):
    self.conn.commit()
    self.cursor.close()

  def insertPython(self, title, date, sale):
    # the unique index on date rejects a second record for the same day
    try:
      self.cursor.execute("INSERT INTO %s VALUES (?,?)"%title, (date,sale))
    except sqlite3.IntegrityError:
      print '%s : maybe already inserted'%title
      return 0
    else:
      print '%s: success'%title
      return 1

  def printPythonResult(self, title, num=None):
    # with num: the most recent num records; without it: all records, oldest first
    if num is None:
      self.cursor.execute('SELECT * FROM %s ORDER BY date ASC'%title)
    else:
      self.cursor.execute('SELECT * FROM %s ORDER BY date DESC LIMIT %d'%(title,num))
    for row in self.cursor.fetchall():
      print row[0],'\t', row[1]

db = DB()

if __name__ == "__main__":
  curtime = time.localtime()
  curday = "%d/%02d/%02d"%(curtime[0],curtime[1],curtime[2])

  for title,url in books.items():
    content = getContent( url )
    soup = BeautifulSoup( content )

    # the sales index is read from <dt class="saleNum">, after the '|' separator
    a = soup('dt', {'class':'saleNum'})
    salenum = -1
    if len(a)>0:
      try:
        text = str( a[0].contents[0] ).split('|')[1]
        #print text
        splited = text.split(' ')
        for s in splited:
          if s.isdigit():
            salenum = int(s)
            break
      except:
        traceback.print_exc()

    print title, ': try to insert :',curday, salenum
    db.insertPython( title, curday, salenum )

    print title, ': === recent 10 sale points ==='
    db.printPythonResult( title, 10 )

  time.sleep(5) # for reading results....

File download: [ salepoint_checker.py ]

ps. It is based on python 2.5.
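The "write at most one record per day" behavior comes from a uniqueness constraint in sqlite3. Below is a minimal Python 3 sketch of just that part, using an in-memory database. The single `sales` table, its columns, and the `record_sale` helper are assumptions made for this illustration and differ from the per-book tables in the program above.

```python
import sqlite3

def record_sale(conn, title, date, sale):
    """Insert one (title, date, sale) row; return False if that day is already recorded."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS sales ("
        "title TEXT, date TEXT, sale INTEGER, UNIQUE(title, date))"
    )
    try:
        # '?' placeholders let sqlite3 handle the quoting of the values
        conn.execute("INSERT INTO sales VALUES (?, ?, ?)", (title, date, sale))
    except sqlite3.IntegrityError:
        return False  # the UNIQUE constraint rejected a duplicate day
    conn.commit()
    return True

conn = sqlite3.connect(":memory:")
first = record_sale(conn, "python", "2009/07/07", 1234)
second = record_sale(conn, "python", "2009/07/07", 1300)
rows = conn.execute("SELECT title, date, sale FROM sales").fetchall()
print(first, second, rows)
```

Running the script twice on the same day therefore leaves the first value in place, which matches the behavior described in the post.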

AND

Building an RSS Feed generator in Python - 2/2

Development/python 2009. 5. 10. 07:03

4. Rss.py is as follows.
Hook it up through mod_python.

# -*- coding: utf-8 -*-

from mod_python import apache
import pickle, re
import os.path, time

# '&' is written as '&amp;' because this URL is embedded in the XML below
url_head = "http://asialadders.battle.net/war3/ladder/W3XP-player-profile.aspx?Gateway=Kalimdor&amp;PlayerName="

def conv( date ):
  # expects input shaped like "Saturday, May 09, 2009 1:54 AM"
  # and returns "Sat, 09 May 2009 1:54 AM"
  ds = date.split(',')
  ds1 = ds[1].split(' ')
  ds2 = ds[2].split(' ')
  date = ds[0][:3]+', '+ds1[2]+' '+ds1[1][:3]+' '+ds2[1]+' '+ds2[2]+' '+ds2[3]
  return date

def getInfo( ):
  # load the list of (user, level, date) tuples saved by Crawl.py
  f = open('/var/www/war3/info', 'rb')
  info = pickle.load( f )
  f.close()
  return info

def handler(req):
  req.content_type="text/xml"
  req.send_http_header()

  # the info file's timestamp serves as the channel pubDate (RSS 2.0 uses RFC 822 dates)
  t = os.path.getctime('/var/www/war3/info')
  pubdate = time.strftime( "%a, %d %b %Y %H:%M:%S GMT", time.gmtime(t) )

  # the channel description means "a system for monitoring who has been playing"
  body ="""<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0">
<channel>
<title>Frozen Throne - Kalimdor Tracer</title>
<link>http://cybershin.x-y.net/tt/</link>
<description>누가누가 달렸나 모니터링 시스템</description>
<language>ko</language>
<pubDate>%s</pubDate>
<generator>dsp generator</generator>""" %pubdate
  footer = """
</channel>
</rss>"""
  info = getInfo()
  for i in info:
    user, level, date = i
    date = conv(date)
    level = level.replace('l','l ') # "Level12" -> "Level 12"
    link = url_head + user
    body += """
<item>
<title>%s</title>
<link>%s</link>
<description>%s, Last Ladder Game : %s</description>
<author>(%s)</author>
<guid>%s#%s</guid>
<pubDate>%s</pubDate>
</item>"""%(user, link, level, date, user, link, re.sub(' ','',date), date )
    # Sat, 18 Apr 2009 00:15:00 +0900
  body += footer
  req.write( body )
  return apache.OK

5. When it runs successfully, the browser renders the feed as shown below,
and it also works well when subscribed from hanrss, outlook, and so on.

Frozen Throne - Kalimdor Tracer
누가누가 달렸나 모니터링 시스템

alpakook
Level 12, Last Ladder Game : Sat, 09 May 2009 1:54 AM
dspshin
Level 6, Last Ladder Game : Fri, 08 May 2009 12:40 AM
soudz
Level 22, Last Ladder Game : Thu, 23 Apr 2009 12:58 AM
milkelf
Level 9, Last Ladder Game : Sun, 03 May 2009 8:20 PM
sacrea
Level 2, Last Ladder Game : Sun, 12 Apr 2009 8:31 PM
again4you
Level 1, Last Ladder Game : Tue, 05 May 2009 6:49 PM
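One detail worth care when assembling XML by hand is escaping: the profile URL contains an "&", which has to appear as "&amp;" inside the feed. The Python 3 sketch below shows one way to build a single item with the standard library's `escape`; `rss_item` and `URL_HEAD` are names invented for this illustration, and only a subset of the item fields is included.

```python
from xml.sax.saxutils import escape
import xml.etree.ElementTree as ET

URL_HEAD = ("http://asialadders.battle.net/war3/ladder/"
            "W3XP-player-profile.aspx?Gateway=Kalimdor&PlayerName=")

def rss_item(user, level, date):
    """Build one <item> element as text, escaping every field value."""
    link = URL_HEAD + user
    fields = [
        ("title", user),
        ("link", link),
        ("description", "%s, Last Ladder Game : %s" % (level, date)),
        ("guid", "%s#%s" % (link, date.replace(" ", ""))),
    ]
    lines = ["<item>"]
    for tag, value in fields:
        lines.append("<%s>%s</%s>" % (tag, escape(value), tag))
    lines.append("</item>")
    return "\n".join(lines)

item = rss_item("dspshin", "Level 6", "Fri, 08 May 2009 12:40 AM")
parsed = ET.fromstring(item)      # raises ParseError if the XML is not well-formed
print(parsed.find("link").text)   # the parser turns "&amp;" back into "&"
```

Parsing the generated text back is a cheap way to confirm that a feed reader will accept it.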

AND

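Building an RSS Feed generator in Python - 1/2

Development/python 2009. 5. 9. 17:03

This example is an rss feed generator that checks whether my friends have been playing War3
and serves the result as rss. Haha

1.
I'm going to use Python, so I install and configure mod_python.
Of course, PHP or any other language would work just as well.
> How: see http://cybershin.x-y.net/tt/188.

2.
This example monitors a particular site and announces updates to that information through RSS.
Reading the site every time a Request arrives would make responses too slow, so the work is split into two processes.
That is, Crawl.py handles collecting the information, and rss.py returns the RSS result.

3. Crawl.py is as follows.
Run it just a few times a day with crontab or the like.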

#!/usr/bin/python
# -*- coding: utf-8 -*-

import urllib2, re, pickle, sys, time
from BeautifulSoup import BeautifulSoup

users = [
  'alpakook', 'dspshin', 'soudz', 'milkelf', 'sacrea','again4you'
]

url_head = "http://asialadders.battle.net/war3/ladder/W3XP-player-profile.aspx?Gateway=Kalimdor&PlayerName="

def getInfo( user ):
  url = url_head + user
  contents = urllib2.urlopen(url).read()
  soup = BeautifulSoup( contents )

  # the last-game date is taken from a <b class="small"> element containing ':'
  B = soup('b', {'class':'small'})
  date = ''
  for b in B:
    if str(b).find(':')>-1:
      date = b.contents[0].strip().encode('ascii')

  # the level text ("Level N") is inside a div identified by its inline style
  level = ''
  lv = soup('div', {'style':"Z-INDEX: 200; LEFT: 75px; POSITION: relative; TOP: -25px"})
  if len(lv)>0:
    body = str(lv[0])
    sp = body.find('Level')
    ep = body[sp:].find('<')
    level = body[sp:sp+ep].strip()
    level = re.sub(r'\s', '', level)

  return level, date

if __name__=="__main__":
  print sys.version
  print 'run crawl.py : '+ time.strftime("%B %dth %A %I:%M ",time.localtime())
  info = []
  for user in users:
    try:
      level, date = getInfo(user)
    except:
      print sys.exc_info()
    else:
      print user, level, date
      info.append( (user, level, date) )

  #print info
  # save the results for rss.py to read
  f = open('/var/www/war3/info', 'wb')
  pickle.dump( info, f )
  f.close()
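The hand-off between the two processes is just a pickled list on disk. Here is a small Python 3 sketch of that pattern, assuming a temporary directory instead of /var/www/war3. `save_info` and `load_info` are hypothetical helpers, and the write-then-rename step is an addition of this sketch (so the reader never sees a half-written file), not something the script above does.

```python
import os
import pickle
import tempfile

def save_info(path, info):
    """Crawler side: write the list to a temporary file, then swap it into place."""
    tmp_path = path + ".tmp"
    with open(tmp_path, "wb") as f:
        pickle.dump(info, f)
    os.replace(tmp_path, path)  # atomic rename on the same filesystem

def load_info(path):
    """Feed side: read back whatever the crawler saved last."""
    with open(path, "rb") as f:
        return pickle.load(f)

with tempfile.TemporaryDirectory() as workdir:
    info_path = os.path.join(workdir, "info")
    save_info(info_path, [("dspshin", "Level6", "Friday, May 08, 2009 12:40 AM")])
    loaded = load_info(info_path)

print(loaded)
```

Since the feed handler only ever reads this file, requests stay fast no matter how slow the crawled site is.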

>>> The rest continues in the next post...

AND
